Dealing with 'less than optimal' data or study designs

Question

I am a PhD student in a quantitative social science field. For the first study of my dissertation, I was given the opportunity to ask some questions that could be applied to an ongoing survey. Affiliated researchers have already used some of this data for a short commentary/descriptive article, and there is interest in applying the data to a slightly different 'modelling' study.

However, the underlying study design leaves much to be desired. Although the sample is relatively large, it is a convenience sample from only a fraction of all possible survey sites in the population. There is no opportunity to expand the data collection process. The data itself is unlikely to have measurement error, but almost any estimate produced from the dataset is highly likely to suffer from sampling bias.

Nevertheless, I realize this is a somewhat 'common' situation graduate students find themselves in. There is active interest from the affiliated partners in carrying out the study, the design isn't 'ideal', and there are logistical considerations of invested time (or, as I see it, a sunk cost fallacy).

While I have many questions, I think the most straightforward is: should I attempt to make 'lemonade out of lemons' or 'stand my ground' that such a modelling study would be inappropriate?

Other potential information:

My supervisor is 'ardent' that such a study should be carried out.
I have sought external input on which methods would be 'most' appropriate; some might be 'less' worrisome than others.

You don't say anything here about the cost in money and time that would be required to do it right. How big a bind is it? — Buffy, Feb 05 '20 at 01:11
It won't compromise your integrity if you state clearly the limitations of your conclusions. And satisfying your advisor is a plus, all else equal. — Aaron Brick, Feb 05 '20 at 01:26
@Buffy that is a fair point. There would be minimal monetary cost, but likely significant logistics to work out. Tens of other out-of-network sampling points would have to be contacted, recruited, and IRB'd, and data collected. Not impossible, but perhaps 2-3 years of logistics work. @Aaron Brick this is true, although I worry that simply stating limitations isn't 'enough', in the sense that this seems to be a scenario where the limitations are more severe than 'usual'. — , Feb 05 '20 at 02:13
@Buffy just wanted to follow up on your comment (this might be useful for future users). In the end I voiced my concerns to my advisor and the project was put on temporary hold. However, I then attended a somewhat random talk and learned that the necessary data was indeed out there! It will take some time, but I should be able to do it right. Future users: don't give up! The search for the 'right' data may take a few weeks. — , Feb 12 '20 at 03:16
