I've developed an algorithm that I believe should theoretically improve upon a previous work. In that work, the authors reported results for both a baseline method and their own method, but they did not release any code. Since this is in the machine learning domain, I understand that it can be difficult to replicate their method's results for a variety of reasons. However, the baseline they used is very simple, and I am confident I can implement it correctly.
After running some experiments, it turns out that my method's results were not better than the numbers reported for their method. However, my method improved upon my own baseline numbers by a greater margin than their method improved upon their reported baseline numbers. The problem is that my replication of the baseline gives different (worse) results than their reported baseline numbers.
Now I am considering a few options for benchmarking my work:
- Ignore their reported results and implement their method myself, then compare my method against my own implementation of theirs
- Compare the difference-of-differences, i.e. how much my method improves upon my baseline numbers versus how much their method improves upon their reported baseline numbers
- Just mention that I was unable to replicate their results due to the lack of code, and present my method's results only
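To make the second option concrete, here is a toy numeric sketch of the difference-of-differences comparison. All scores below are invented placeholders, not numbers from any paper:

```python
# Hypothetical scores purely for illustration of the diff-of-diff idea.
their_baseline = 70.0   # baseline score as reported in their paper
their_method   = 73.0   # their method's reported score
my_baseline    = 68.0   # my own replication of the baseline (worse)
my_method      = 72.5   # my method's score

their_gain = their_method - their_baseline  # improvement they report: 3.0
my_gain    = my_method - my_baseline        # improvement I observe:   4.5

# Under this comparison my method shows a larger improvement (4.5 vs 3.0)
# even though its absolute score (72.5) is below their reported 73.0.
print(f"their gain: {their_gain}, my gain: {my_gain}")
```

The catch, of course, is that this comparison is only meaningful if the two baselines are genuinely the same method run under comparable conditions, which is exactly what my failed replication calls into question.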
What would be the best option for benchmarking my method?