All of us at ContentMine are incredibly proud to have been funded by the Wikimedia Foundation to work on the WikiFactMine Project.
This thread is here to help get the discussion started and to make sure our thoughts and plans are easy to see rather than falling into a massive email chain.
Some great suggestions have already been made by Daniel Mietchen on GitHub.
This post is inspired by the BibTeX-from-Wikidata functionality described in https://larsgw.blogspot.de/2016/09/citationjs-on-command-line.html
Some thoughts on how to integrate ContentMine's paper metadata handling with Wikidata:
if a ContentMine pipeline (or any reference file in BibTeX or a similar format, for that matter) touches bibliographic metadata of scholarly articles, check whether Wikidata items for these articles already exist (e.g. via P932 for the PMCID, P356 for the DOI or P698 for the PubMed ID).
if yes, it might simply trigger an integrity check of these metadata, perhaps identify the main subject (P921), or do nothing for the moment.
if no, it should start the missing items with at least some basic properties (e.g. P31:Q13442814 and the respective value for a persistent identifier). If this would leave the items incomplete with respect to Wikidata's data model for scholarly articles, the missing pieces could be handled by the mostly existing pipelines around constraint violations.
in addition to the existing ContentMine pipelines that search by dictionaries, it might be interesting to be able to search the literature (across all or selected dictionaries) by author, institution, journal, date or similar, using metadata that Wikidata could help supply.
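To make the "check first, create only if missing" idea a bit more concrete, here is a rough Python sketch. It only builds the SPARQL lookup and decides what to do next; it does not talk to Wikidata. The function names (`normalize_identifier`, `build_lookup_query`, `plan_action`) and the shape of the returned dictionaries are made up for illustration and are not part of any ContentMine or Wikidata tooling. The only things taken from the points above are the property and item IDs. It also assumes DOIs should be upper-cased before matching, which is how Wikidata conventionally stores them.

```python
from typing import Optional

# Property and item IDs mentioned in this post.
ID_PROPERTIES = {"pmcid": "P932", "doi": "P356", "pmid": "P698"}
INSTANCE_OF = "P31"
SCHOLARLY_ARTICLE = "Q13442814"


def normalize_identifier(id_type: str, value: str) -> str:
    """Trim whitespace; upper-case DOIs (assumed Wikidata convention)."""
    value = value.strip()
    return value.upper() if id_type == "doi" else value


def build_lookup_query(id_type: str, value: str) -> str:
    """Build a SPARQL query that finds an item by persistent identifier."""
    prop = ID_PROPERTIES[id_type]
    value = normalize_identifier(id_type, value)
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?item WHERE {{ ?item wdt:{prop} "{escaped}" . }} LIMIT 1'


def plan_action(existing_qid: Optional[str], id_type: str, value: str) -> dict:
    """Decide what to do once the lookup result is known.

    existing_qid is the QID returned by the lookup, or None if nothing matched.
    """
    if existing_qid is not None:
        # Item exists: only an integrity check is suggested for now.
        return {"action": "check", "item": existing_qid}
    # Item is missing: start it with the minimal statements from the post.
    return {
        "action": "create",
        "statements": [
            (INSTANCE_OF, SCHOLARLY_ARTICLE),
            (ID_PROPERTIES[id_type], normalize_identifier(id_type, value)),
        ],
    }
```

The query string could be sent to the Wikidata Query Service and the first `?item` result (if any) passed to `plan_action`; actually creating or editing items is deliberately left out, since that depends on which bot framework is used.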
What about running ContentMine over Wikipedia dumps to identify facts?
if these facts are referenced on Wikipedia to scholarly sources, ContentMine could check whether the indicated sources actually support the statement, and flag cases where that's not clear.
if the Wikipedia statements lack scholarly references, ContentMine might be able to find some.
as above, the metadata of the scholarly references would go to Wikidata, from where it might be pulled into the respective Wikipedia article by way of some variant of Module:Cite.
If you have any ideas or comments, then why not reply below or, if it's a big, complex idea, start a new Discourse thread.