I have used CrossRef to extract metadata for a complete week of scholarship, from 2016-06-01 to 2016-06-07. I did this by manually constructing the CrossRef RESTful API URLs, although since then it can be done better with getpapers.
This gets metadata for "everything": books, "components", journal-articles, etc.
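As a rough illustration of what "manually constructing the URLs" can look like, here is a minimal Python sketch that builds one query URL per day against the public `https://api.crossref.org/works` endpoint. It is an assumption that the records were selected by creation date (`from-created-date` / `until-created-date`); the original queries are not shown here, and a different date filter (publication, deposit or index date) would give different counts. The helper name `crossref_day_url` is made up for this example.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_day_url(day, rows=0):
    """Build a /works query URL for a single day given as YYYY-MM-DD.

    Assumption: we filter on the record creation date. The filter
    actually used for the figures in this post is not known.
    """
    params = {
        "filter": f"from-created-date:{day},until-created-date:{day}",
        # rows=0 asks for the summary (including the total count) only
        "rows": rows,
    }
    # Keep ':' and ',' readable, since the filter syntax relies on them
    return f"{CROSSREF_WORKS}?{urlencode(params, safe=':,')}"


# One URL per day of the week 2016-06-01 .. 2016-06-07
urls = [crossref_day_url(f"2016-06-{d:02d}") for d in range(1, 8)]
for url in urls:
    print(url)
```

The sketch only builds the URLs and does not fetch anything; each one can be pasted into a browser or passed to any HTTP client.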
The coverage depends on what "publishers" submit. It is very probably weak on the Global South, on theses, and will be skewed to commercial publishers and large institutions.
Here's the scale for the 7 days (number of records per day):
18556
25599
18571
8446
43150
16483
19576
Treat this with caution as we haven't listed the types yet. I think some contributors don't deposit every day, which would explain the large day-to-day variation, but it gives an idea of the scale: about 150K scholarly objects per week. I imagine that most/all have DOIs.
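The weekly figure can be checked directly from the seven daily counts above. A small Python sketch (the variable names are mine):

```python
# Daily record counts for the week, as listed above
daily_counts = [18556, 25599, 18571, 8446, 43150, 16483, 19576]

weekly_total = sum(daily_counts)
daily_mean = weekly_total / len(daily_counts)

print(weekly_total)                           # 150381
print(round(daily_mean))                      # 21483
print(min(daily_counts), max(daily_counts))   # 8446 43150
```

The busiest day has roughly five times as many records as the quietest, which is why a single day is a poor guide to the overall rate.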
Here's an analysis of the types in CrossRef. I imagine these are assigned by the publisher:
journal-article,106863
book-chapter,19487
component,11071
proceedings-article,4953
dataset,1806
journal-issue,1383
other,1022
report,806
monograph,766
book,670
journal,670
reference-entry,363
dissertation,215
standard,154
report-series,67
proceedings,33
journal-volume,26
book-section,12
reference-book,12
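To see how the week divides between types, the `name,count` lines above can be parsed and turned into percentages. This is just a sketch over the figures listed here, not the code used to produce them:

```python
type_lines = """journal-article,106863
book-chapter,19487
component,11071
proceedings-article,4953
dataset,1806
journal-issue,1383
other,1022
report,806
monograph,766
book,670
journal,670
reference-entry,363
dissertation,215
standard,154
report-series,67
proceedings,33
journal-volume,26
book-section,12
reference-book,12""".splitlines()

# Split on the LAST comma so a name containing commas would still parse
counts = {}
for line in type_lines:
    name, count = line.rsplit(",", 1)
    counts[name] = int(count)

total = sum(counts.values())
print(total)  # 150379

# Show the five largest types with their share of the week
top5 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:5]
for name, count in top5:
    print(f"{name}: {count} ({100 * count / total:.1f}%)")
```

Journal articles make up about 71% of the records. The total across types, 150379, is within two records of the sum of the seven daily figures.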
All these data will be published as CC0.
There were about 1000 publishers during that one week; here are the leading 50. Again, a caution: we haven't normalized the names, and some publishers may deposit irregularly. But there were some surprises for me. Divide by 7 to get approximate daily totals, and multiply by ~50 (roughly the number of weeks in a year) to estimate the yearly ones.
Elsevier BV,21614
Springer Nature,16576
Informa UK Limited,9945
Wiley-Blackwell,7703
Public Library of Science (PLoS),6909
Cambridge University Press (CUP),6673
Walter de Gruyter GmbH,5383
Organisation for Economic Co-Operation and Development (OECD),3309
Institute of Electrical & Electronics Engineers (IEEE),2511
Virginia Tech Libraries,2429
SAGE Publications,2163
Oxford University Press (OUP),2097
MDPI AG,2097
Brill Academic Publishers,1718
University of Arizona,1701
Copernicus GmbH,1486
Royal Society of Chemistry (RSC),1454
American Chemical Society (ACS),1352
OpenEdition,1351
PERSEE Program,1306
Ovid Technologies (Wolters Kluwer Health),1111
China Science Publishing & Media Ltd.,1108
Thieme Publishing Group,1054
Shanghai Institute of Optics and Fine Mechanics,929
AIP Publishing,892
Edward Elgar Publishing,889
Association for Computing Machinery (ACM),862
IOP Publishing,796
Harvard University Press,722
EDP Sciences,659
QIRT Council,619
RCN Publishing Ltd.,587
SWGE Sistemas,566
eLife Sciences Organisation, Ltd.,564
American Physical Society (APS),523
Medknow,512
American Scientific Publishers,494
American Society of Civil Engineers (ASCE),492
National Institute of Standards and Technology (NIST),484
OMICS Publishing Group,477
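As a worked example of the "divide by 7, multiply by ~50" rule, here is a short Python sketch applied to three of the lines above. The function name `scale` is mine, and the results are only as reliable as the assumption that this week was typical for each publisher. Note that some publisher names contain commas, so the count has to be split off from the right.

```python
WEEKS_PER_YEAR = 50  # the rough multiplier suggested above


def scale(weekly):
    """Return (approximate daily, approximate yearly) counts from a weekly count."""
    return round(weekly / 7), weekly * WEEKS_PER_YEAR


publisher_lines = [
    "Elsevier BV,21614",
    "Public Library of Science (PLoS),6909",
    "eLife Sciences Organisation, Ltd.,564",  # name itself contains a comma
]

for line in publisher_lines:
    # rsplit on the last comma keeps "eLife Sciences Organisation, Ltd." intact
    name, count = line.rsplit(",", 1)
    daily, yearly = scale(int(count))
    print(f"{name}: ~{daily}/day, ~{yearly}/year")
```

For Elsevier this gives roughly 3088 records a day and about 1.08 million a year.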