Friday 29 November 2013

Pre-Theme 4: Quantitative research


Information sharing on social media sites

B. Osatuyi, from the journal Computers in Human Behavior, Impact Factor: 2.067.

 

Problem

Credibility of information shared on social media sites.

Research design - quantitative methods

Data was collected using a web-based survey with e-mail invitations. The study was exploratory. 200 university students were e-mailed a link to the survey. Only 114 students gave complete answers, even though all participating students were rewarded with extra points towards their grades.

Five social media technologies were included: 
  1. Social networking
  2. Micro-blogs
  3. Wikis
  4. Forums
  5. Blogs
The definition of each information type explored in the paper is backed up by references to other studies.
The study used the following categories for information: 
  • Personal (sensitive); health, relations
  • Sensational; news, gossip, science
  • Political; political discussions
  • Casual; restaurant reviews etc
A chi-square test was used to analyze the results regarding which media technology is used to share information. The paper justifies the method by its ability to estimate whether two variables are independent. The chi-square test compares each observed value with an expected value. I did not find any such expected values in this paper.
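To make the comparison of observed and expected values concrete, here is a minimal sketch of a chi-square test of independence. The counts and labels are made up for illustration and are not taken from the paper. In this kind of test the expected values are normally derived from the row and column totals of the observed table, which may be why they are not listed separately.

```python
# Hypothetical counts (NOT from the paper): rows are information types,
# columns are social media technologies.
observed = [
    [30, 10],  # e.g. personal information: social networking, blogs
    [20, 40],  # e.g. casual information:   social networking, blogs
]


def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof, expected


stat, dof, expected = chi_square(observed)
print(stat, dof, expected)
```

A statistic above the critical value for the given degrees of freedom (3.84 for one degree of freedom at the 5% level) suggests that the two variables are not independent.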
The study also introduces a classification code that the participants may use to indicate credibility of the shared information on social media. It's on a scale between 1 and 14 and contains fourteen combinations of the following credibility indicators: 
  • Link to other sources
  • Topic of interest
  • Embedded video
  • Embedded audio
However, it does not include the familiarity factor. My immediate thought is that familiarity with the person posting the information will affect the credibility quite a bit. I think they could have referenced such a study. I found one research paper from University of Alabama called "Factors and effects of information credibility" that could've been relevant. However, that type of research might be of a more qualitative nature.

Above methods were used to reach the answer to their first question, "Does the codification of information credibility vary across different social media sites?"

The second to fifth questions were about whether the information shared varies between different social media. The variance of the answers seems to have been analysed with ANOVA, which involves calculating the degrees of freedom.
The analysis methods for these questions were the same. An example: the result of the first question in this series was that there was no statistically significant difference in which social media site people use to share personal information. I find myself at a loss to criticize the method since I lack the proper knowledge about it.
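For readers in the same position, a one-way ANOVA can be sketched in a few lines. The ratings are invented and the site names are placeholders. The sketch only shows where the degrees of freedom and the F value come from, not how the paper computed its results.

```python
# Invented ratings (NOT from the paper): how often respondents share
# personal information on three hypothetical sites.
groups = {
    "site_a": [1, 2, 3],
    "site_b": [2, 3, 4],
    "site_c": [3, 4, 5],
}


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(s) / len(s) for s in samples]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_value, df_between, df_within = one_way_anova(list(groups.values()))
print(f_value, df_between, df_within)
```

The F value is compared with a critical value that depends on the two degrees of freedom. A small F value means the group means differ no more than chance would explain.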


Conclusion and What did I find and learn?

The conclusion from this survey was that there is a difference between how people share information on social networking sites compared to other social media sites. It also finds that people are careful with what personal information they share. The researchers argue that organizations may use this to better engage their customers.
I think that they could have found a way to contact a larger initial group of potential participants, since not many people answer surveys. From personal experience, I will often not participate if I am not interested. Maybe this is a weakness of the method: only very motivated people will participate.
Though they seem appropriate, for me it was a bit difficult to analyze the statistical methods due to limited knowledge of statistics. I had to look up the basics of ANOVA, Chi-squared test and degrees-of-freedom, but would still benefit from attending a statistics course.





Physical activity, stress, and self-reported upper respiratory tract infection  

Bälter et al.

Problem and Background

Many people seek medical attention for the common cold and influenza (URTI, upper respiratory tract infection). The research focused on investigating the relation between URTI, physical activity and stress.
Previous studies of similar nature have been more qualitative and focused on smaller groups of athletes.

Research design

The study used a population-based prospective cohort method where about 1500 random participants were selected.
For the quantitative data collection, it used an adaptive web questionnaire with e-mail reminders. The answers to various questions about health and lifestyle were automatically summarized into points on the MET-hour and Perceived Stress scales, where MET is a measure of the rate of energy consumption.
The MET value of each activity was multiplied by the hours that questionnaire participants reported spending on it, giving MET-hours. There's a clear definition of how stress was measured with the Perceived Stress Scale (PSS).
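MET-hours are obtained by multiplying the MET value of an activity by the time spent on it and summing over all activities. A minimal sketch follows; the activities, MET values and hours are illustrative assumptions, not the values used in the study.

```python
# Illustrative MET values and reported hours (assumptions, not study data).
activities = [
    {"name": "sitting", "met": 1.0, "hours": 8.0},
    {"name": "walking", "met": 3.5, "hours": 1.0},
    {"name": "running", "met": 8.0, "hours": 0.5},
]


def met_hours(entries):
    """Sum MET value times hours over all reported activities."""
    return sum(e["met"] * e["hours"] for e in entries)


total = met_hours(activities)
print(total)
```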
Follow-up questions on influenza vaccination and allergies were asked at the end of the data collection phase to rule out symptoms not related to URTI.
A clear diagram shows the selection and filtering of random participants for the study. Nice tables present the collected multi-variable data. Incomplete results were not used.

Discussion and Conclusion

The conclusion from the analysis was that physical activity lowers the risk of URTI for both sexes. It also states that highly stressed men benefit the most from physical activity. The study could confirm findings from other similar research regarding the relation between physical activity and URTI, except for one study where the test group consisted of professional athletes. It could, however, not clearly confirm an overall relation between stress and URTI, despite attempts to exclude or select parts of the data to make it fit previous findings. There was apparently a stronger relation between stress and URTI in men, though, perhaps because of the way men react to stressful situations.
I think that they satisfactorily discuss weak points in the data and explain why these probably do not affect the overall result. The conclusions can be useful in further studies, and as a resource in informing the public about exercise, illness and stress. I think that this was an excellent report and the scope is wide enough; I wouldn't change anything really.

Which are the benefits and limitations of using quantitative methods?
From reading the research by Bälter, it seems that it's easier to get a result that represents the majority if a large and random group is studied. If the subject group is too small, the result may not be representative. Also, phenomena may be missed because the focus is on testing a hypothesis rather than producing one.
Which are the benefits and limitations of using qualitative methods?
Since a qualitative study selects a smaller group to study, it can give a clear and detailed description on an individual and specific level. It can, as opposed to a quantitative study, react and change focus based on findings during the study. However, collection of data might be more difficult and take more time, the results might not be applicable to other groups, and the result could be influenced by the researchers' subjectivity.

References
University of South Alabama 2013,

Wednesday 27 November 2013

Post-Theme 3: Research and theory

So, this week we've been talking about what theory is and is not. I think that I managed to summarise it nicely in the Pre-Theme 3 post after reading Gregor's The Nature of Theory in Information Systems and Sutton's What Theory is Not.


What people in general call a theory might not be a theory at all


Even though I regard it as relevant, I know that the paper I chose might seem like quite an odd bird in the Media Technology forest. I think it has to do with my background and interest in electronics, since I actually studied electronics and computer systems for three years at KTH and worked in IT and later in the electronics manufacturing sector for a number of years before adding the Media layer on top of that. My argument that media storage is an important part of Media Technology might be a little bit like saying that knowing a lot about paper quality is important to be able to print good news. But even though it might not be rock solid, I think it still holds. And we did add both the paper to the example papers and the theory of the paper to the examples of theories during the seminar. The theory of this paper was of the Design and action type (Gregor), which seems quite uncommon for Media Technology research papers, at least from what I've seen so far. That's why we chose to add it; we called it "Theory of Information Storage".


Anyway, during the seminar we discussed what theory is, and some of us added to and modified the "What is theory" page. I think that there were some nice ideas about it, but still, it's always possible to question any definition, perhaps something that we've learnt from reading Russell and Plato. For instance, there was a kind of consensus in the seminar group that a theory needed consensus as well. But would it be enough if a team of experts agreed, an esoteric consensus if you will, or would the understanding and confirmation of the construction of a theory need support from other groups as well?





Most Media Technology research seems to have a lot to do with psychology and social interaction, and the most common theories selected so far seem to be of the Analysis or Explanation types. That's all good and useful in many cases; however, I like it when research can actually lead to some action of improvement as well. And of course, another study could build on the one that used the Analysis theory, use it as a reference and then perhaps apply a theory that is more of the Design and action type.


I feel now that I've got a better understanding of what can be called theory in a research paper: the findings can only be part of the theory if it is explained how and why. It's going to take a few more research papers to understand the different variants even better, though, and it's clear now that writing a decent research paper requires adequate understanding of what theory is and how it should be presented.



Friday 22 November 2013

Pre-Theme 3: Research and theory



Journal


I found the journal Nature (nature.com) from the publisher Nature Publishing Group. It has an impressive impact factor of 38.597.

The main categories, chemistry, clinical, environment, life sciences and physics of this journal do not qualify as relevant to media technology. However, I found a very interesting article on applied genetics that converges into an important part of media technology, thus I find the journal relevant.


Paper


I've chosen a paper with the title Towards practical, high-capacity, low-maintenance information storage in synthesized DNA, found in the Nature journal. It is a multidisciplinary study on how synthetic DNA can be used as a bearer of digital information. The hypothesis is that synthesized DNA would be suitable for carrying large amounts of digital data, and for long term storage.


Problem and background


The main problem is described as the great increase in production of digital media leading to an increasingly complex and maintenance intensive task of archiving the same.



It is already known that DNA stores genetic information, and the technology to sequence synthetic DNA is available. Storing data in DNA has been done before; the main problem seems to be synthesizing long sequences of DNA to an exact design. I think that there is a very logical argument here, for if DNA cannot store an exact representation of binary data, then the data stored will become corrupt.


Research design


This research is applied since it clearly utilizes known technology and previous studies in DNA and data storage. It's a proof of concept in which a number of digital media files, such as text, audio and pictures, were selected to be stored in DNA. I find this an excellent method.

The generated DNA was shipped from the USA to Germany without any special packaging to prove its durability and long-term storage capabilities. One flaw here is that there is no data about the environment during transport and how it would or would not affect the DNA.


It is explained that an Illumina HiSeq 2000 system was used to re-read the DNA in order to decode it. It's unclear which equipment produced the DNA.





Findings, discussion and conclusions

Encoding of data files into DNA is clearly explained with comprehensible diagrams. It is also visualized how the data is then stored in DNA fragments and how every second fragment is a reverse complement of the previous for redundancy.
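A reverse complement is a strand read backwards with each base swapped for its pairing partner (A with T, C with G). The following is a minimal sketch of that single operation, using a made-up sequence; it is not the paper's encoding scheme.

```python
# Base pairing used to build the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Read the sequence backwards and swap each base for its partner."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


fragment = "ATGC"  # made-up example sequence
print(reverse_complement(fragment))
```

Applying the function twice returns the original sequence, which is why a reverse-complemented fragment carries the same information as the original.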


Not knowing much about genetics, I could still comprehend the research document. It does help to understand the concept of base calling, the step in sequencing where the raw signal from the sequencer is translated into a sequence of bases.


The method used to overcome the difficulty of sequencing long strings of DNA without error was to produce segments that overlap other segments carrying the same information, creating redundancy. The method seems relevant to the problem.
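The overlap idea can be sketched as splitting a sequence into fixed-length fragments that start at a regular offset smaller than the fragment length. The sequence and lengths below are arbitrary and chosen for readability; they are not the parameters used in the paper.

```python
def overlapping_fragments(seq, length, step):
    """Split seq into fragments of `length` bases, one starting every `step` bases."""
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]


# Arbitrary example: 4-base fragments, each shifted by 2 bases.
fragments = overlapping_fragments("ACGTACGTAC", length=4, step=2)
print(fragments)
```

With a step of half the fragment length, every base away from the ends appears in two fragments, so an error in one fragment can be checked against the other.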


Diagrams are presented for efficiency, base error and relative cost.

While the efficiency diagram clearly explains the relation between amount of data and current synthesis costs, it is unclear how the projected future costs have been obtained.


The base error data is obtained both from the empirical analysis of 5 shipped DNAs and a theoretical model. One of the DNAs contained errors but was manually repaired and used anyway.

The samples match the theoretical plot.


Regarding cost effectiveness, the study gives an application example where CERN currently has 80 PB of data and produces 15 PB each year, while only 10% can be stored on disk. The capacity problem seems possible to solve with DNA storage.


It is proposed that DNA storage could be cost effective, despite a breathtaking price tag of $12,400 per MB! A quick look puts the average cost of hard disk storage in 2013 at $0.05 per MB. By those figures, DNA would be 248,000 times more expensive than HDDs. The study, however, compares the relative cost of long-term storage. Since data on traditional disks needs to be transferred regularly in order to keep the information intact, the argument becomes slightly more relevant.
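The ratio follows directly from the two per-MB figures quoted in this post, as this small check shows. The variable names are my own.

```python
dna_cost_per_mb = 12_400.0  # USD per MB, DNA storage figure quoted in this post
hdd_cost_per_mb = 0.05      # USD per MB, 2013 disk figure quoted in this post

# How many times more expensive DNA storage is per MB, by these two figures.
ratio = dna_cost_per_mb / hdd_cost_per_mb
print(round(ratio))
```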


The conclusions are that DNA storage could be feasible for archiving huge amounts of data while being less expensive over time compared to traditional storage.

Weak points: the assumptions about the future cost of DNA synthesis, and the fact that DNA read and write time, which is far too long, has been completely overlooked.


Future research


The results are relevant for further research into new storage technology, but currently not of practical use. Reducing access time and reporting on how DNA synthesis can become cheaper should be the next steps.

---



What theory Is


Theory is a way of thinking systematically in order to accumulate knowledge.
A theory can be constructed with statements backed up by facts.
There should be consensus in the construction of the theory.
Theories can be used to analyze, explain, enlighten about and predict concepts.


What it is Not


Collected data, hypotheses and references to other works. These are components in a theory but are not theories by themselves.


Theories in my selected paper


My selected paper is a mix of the Prediction and the Design and action theory types. This is because it describes the current situation, proposes a future solution and at the same time at least partially explains how to achieve the solution.


Benefits and limitations of using the selected theory or theories


The benefits of using these two theories are that there is a concrete problem to be solved, that it can easily be described, and that there are available technologies that can be developed further to become a solution to the problem. Using the selected theories makes it easy for other researchers to continue where this paper came to its conclusion.