I'm a sociology undergrad working on an essay for a methods class. I'm also planning on submitting it as a sample for my application to grad school. I don't want to be too specific, but I believe that this work is quite original and my hypothesis would confirm previous literature, and all in all I think it would make a good impression on the admissions committee.

So basically I've run the tests and I'm getting conflicting results. Using one dataset (which has more observations) gives me very significant results, while using another one (which would arguably be more accurate) doesn't give me anything. So here I am at a crossroads, and I've come up with three possible options as to what to do:

  1. Only show the significant results. After all, this is just a ten-page essay, it's not supposed to be publishable or anything, right?

  2. Only use the better dataset and admit that there just isn't much there - maybe blaming it on the small sample size or on the not-so-good dependent variable. Hopefully the committee would appreciate the honesty and the relatively advanced methods that I used.

  3. Show results from both datasets, suggesting that the differences might be due to the sample size or maybe to chance.

As I type this I'm leaning more towards option 3, but I'd like to hear from people with more experience in academia. What should I do?

undergrad_dilemma
  • 86
    Contradictory results are the first step towards a discovery. – henning Dec 10 '18 at 17:21
  • 59
    @henning ...or a debunking of scientific credos. Embrace the contradiction. – Captain Emacs Dec 10 '18 at 17:23
  • 27
    "this work is quite original and my hypothesis would confirm previous literature" It confirms existing previous results, but it's original? – Acccumulation Dec 10 '18 at 18:23
  • 7
    +1 for asking. I strongly recommend you visit Andrew Gelman's blog regularly for discussions of the proper way to do statistics, particularly in the social sciences. Here's one example: https://andrewgelman.com/?s=file+drawer – Ethan Bolker Dec 10 '18 at 18:39
  • 4
    Turn the question around. Don't ask "how honest should I be?" Ask "how hard should I attempt to deceive my reviewers?" Is the answer to the question more straightforward when you ask it that way? – Eric Lippert Dec 10 '18 at 22:28
  • 3
    Can you get a third dataset? – Headcrab Dec 11 '18 at 08:15
  • 9
    Everybody seems to agree, but then bizarrely you have so many papers published with amazing results on hand-picked datasets that nobody can reproduce on any other dataset :-) – jcaron Dec 11 '18 at 09:47
  • 1) [...] After all, just an essay. 2) [...] appreciate the honesty. 3) [...] suggesting that [...] - Lots of cutting there, sorry, but I find your dynamic take on 'honesty' quite fascinating given how your question starts out. – notjustme Dec 11 '18 at 16:25
  • 6
    Once I saw the words "how honest" and "disclosing results", I knew the answer to your question would be "completely honest." – JaS Dec 11 '18 at 18:33
  • I actually just listened to a podcast from the lovely folks at SYSK that hit on this exact topic. A huge problem with research is that only the "sexy" results (their descriptor, not mine) tend to get reported and it leads to misleading/outright false information (for those interested: https://www.stuffyoushouldknow.com/podcasts/research-tips-from-sysk.htm) – Broots Waymb Dec 11 '18 at 18:56
  • @jcaron that's the powerlessness of the 'ought'. Unfortunately, publication bias is a reality. – henning Dec 12 '18 at 09:06
  • 1
    There is a third option: bad methodology -- either in collecting the data or in deciding what data to collect. – Marxos Dec 12 '18 at 18:07
  • Welcome to science! – Ratbert Dec 12 '18 at 19:31
  • You're forgetting the scientific adage that any theory can be considered proven if doing so involves throwing out fewer than half of your observations. (I'm being sarcastic.) – Mark Meuer Dec 14 '18 at 16:37
  • 2
    I want to make sure I understand. Your essay covers two experiments. The first had a larger data set and turned out significant results, but its variables are lackluster. The second study has more interesting variables but the study was too small to turn out a significant result. Is that correct? Sounds like exactly what I would want to happen in order to justify repeating the second experiment with a larger study. Would that not be the obvious conclusion for the essay? – John Wu Dec 16 '18 at 09:25