Dominika Tkaczyk

Head of Strategic Initiatives

Biography

Dominika joined Crossref’s R&D group in the Tech team in August 2018. Her research interests focus on machine learning and natural language processing, in particular their applications to the automated analysis of scientific literature and research outputs. Previously, she worked on a number of projects, including the extraction of machine-readable metadata from scholarly documents, predicting people’s demographic features from their internet browsing history, and developing new metrics for assessing the effectiveness of worldwide air traffic. Dominika’s career started in Poland, where she was a researcher and data scientist at the University of Warsaw. She received a PhD in Computer Science from the Polish Academy of Sciences in 2016. In 2017 Dominika was awarded a Marie Skłodowska-Curie EDGE Fellowship and moved to Ireland to work as a postdoctoral researcher at Trinity College Dublin. When she is not busy training yet another random forest or neural network, you can find her at the nearest Doctor Who convention or rock/metal concert.

Dominika Tkaczyk's Latest Blog Posts

Follow the money, or how to link grants to research outputs

Dominika Tkaczyk, Thursday, Sep 1, 2022

In Grants, Linking, Crossref Labs

The ecosystem of scholarly metadata is filled with relationships between items of various types: a person authored a paper, a paper cites a book, a funder funded research. These relationships are essential: an item without them is missing the most basic context about its structure, origin, and impact. It is no wonder that finding and exposing such relationships is considered important by virtually all parties involved. Probably the most famous instance of this problem is finding citation links between research outputs. Lately, another instance has been drawing more and more attention: linking research outputs with the grants that funded them. How can this be done, and how many such links can we observe?
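The post explores this in depth; as a quick illustration of the machinery involved, the public Crossref REST API exposes funding metadata through its documented award.funder and award.number filters. A minimal sketch, assuming the requests library and using placeholder identifiers rather than any values from the post:

```python
# Sketch: list works whose funding metadata names a given funder and
# award number, via the public Crossref REST API. The funder DOI and
# award number below are placeholders, not values from the post.
import requests

CROSSREF_WORKS = "https://api.crossref.org/works"

def works_for_award(funder_doi: str, award_number: str, rows: int = 5) -> list:
    """Return works matching the award.funder and award.number filters."""
    params = {
        "filter": f"award.funder:{funder_doi},award.number:{award_number}",
        "rows": rows,
    }
    resp = requests.get(CROSSREF_WORKS, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["message"]["items"]

# Placeholder identifiers -- substitute a real funder DOI and award number.
for work in works_for_award("10.13039/100000001", "1234567"):
    print(work.get("DOI"), (work.get("title") or ["(no title)"])[0])
```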

Double trouble with DOIs

Dominika Tkaczyk, Thursday, Sep 1, 2022

In Crossref Labs, Metadata, Metadata Quality

Detective Matcher stopped abruptly behind the corner of a short building, praying that his loud heartbeat wouldn’t give away his presence. This missing DOI case was unlike any other before, keeping him awake for many seconds already. It had taken great effort, and a good amount of help from his clever assistant Fuzzy Comparison, to make sense of the sparse clues provided by Miss Unstructured Reference, an elegant young lady with a shy smile, who had begged him to take up this case at any cost.
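Behind the noir framing is a real technique: fuzzy comparison of an unstructured reference string against candidate records’ metadata. A minimal sketch, using the standard-library difflib, with entirely made-up candidate records and an illustrative threshold:

```python
# Sketch of the "fuzzy comparison" the teaser alludes to: score how
# well an unstructured reference string matches candidate records'
# metadata. The candidate records (10.5555 is the example DOI prefix)
# and the acceptance threshold are made up for illustration.
from difflib import SequenceMatcher

def similarity(reference: str, candidate: str) -> float:
    """Normalized string similarity between a reference and candidate metadata."""
    return SequenceMatcher(None, reference.lower(), candidate.lower()).ratio()

reference = "Smith J., A study of reference matching, J. of Examples, 2015"
candidates = {
    "10.5555/12345678": "A study of reference matching, Smith, Journal of Examples, 2015",
    "10.5555/87654321": "An unrelated paper about air traffic metrics, 2012",
}

best_doi, best_score = max(
    ((doi, similarity(reference, meta)) for doi, meta in candidates.items()),
    key=lambda pair: pair[1],
)
if best_score >= 0.5:  # illustrative acceptance threshold
    print(best_doi, round(best_score, 2))
```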

Crossref metadata for bibliometrics

Ginny Hendricks, Thursday, Sep 1, 2022

In Metadata, Bibliometrics, Citation Data, APIs, API Case Study

Our paper, Crossref: the sustainable source of community-owned scholarly metadata, was recently published in Quantitative Science Studies (MIT Press). The paper describes the scholarly metadata collected and made available by Crossref, as well as its importance in the scholarly research ecosystem.

What's your (citations') style?

Dominika Tkaczyk, Thursday, Sep 1, 2022

In Citation, Crossref Labs, Machine Learning

Bibliographic references in scientific papers are the end result of a process that typically involves finding the right document to cite, obtaining its metadata, and formatting that metadata in a specific citation style. The end result, however, does not preserve which citation style was used to generate it. Can the citation style be guessed from the reference string alone? TL;DR: I built an automatic citation style classifier. It classifies a given bibliographic reference string into one of 17 citation styles or “unknown”.
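The post describes the classifier itself; as a minimal sketch of the general idea only (not the actual model, features, or training data from the post), a supervised classifier over character n-grams of the reference string might look like this:

```python
# Minimal sketch of a citation style classifier: TF-IDF over character
# n-grams plus logistic regression. The two training references and
# style labels are illustrative stand-ins, not the post's data or model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_refs = [
    'Tkaczyk, D. (2018). An example paper. Journal of Examples, 1(2), 3-4.',
    'D. Tkaczyk, "An example paper," J. Examples, vol. 1, no. 2, pp. 3-4, 2018.',
]
train_styles = ["apa", "ieee"]  # a real classifier would cover all 17 styles

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(),
)
clf.fit(train_refs, train_styles)

print(clf.predict(['Tkaczyk, D. (2020). Another paper. J. Ex., 5(6), 7-8.']))
```

Character n-grams work well for this kind of task because style differences show up in punctuation and field ordering (“(2018).” vs. “, 2018.”) rather than in the words themselves.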

What if I told you that bibliographic references can be structured?

Dominika Tkaczyk, Thursday, Sep 1, 2022

In Linking, Citation, Crossref Labs, Reference Matching

Last year I spent several weeks studying how to automatically match unstructured references to DOIs (you can read about these experiments in my previous blog posts). But what about references that come not as unstructured strings, but as structured collections of metadata fields? Are we matching them, and how? Let’s find out.
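As a rough sketch of what matching a structured reference can look like (an illustration, not Crossref’s production matcher): build a query string from the reference’s fields, send it to the REST API’s query.bibliographic and query.author parameters, and accept the top hit only if its relevance score clears a threshold. The example reference is a real paper by the author; the field names and threshold value are assumptions for illustration:

```python
# Sketch: match a structured reference (metadata fields) to a DOI by
# querying the Crossref REST API and keeping the top hit only if its
# relevance score clears a threshold. The threshold is an assumption.
import requests

def match_structured_reference(ref: dict, threshold: float = 60.0):
    """Build a query from the reference's fields; return the best DOI or None."""
    fields = ("article-title", "journal-title", "year", "volume", "first-page")
    params = {
        "query.bibliographic": " ".join(str(ref[k]) for k in fields if k in ref),
        "query.author": ref.get("author", ""),
        "rows": 1,
    }
    resp = requests.get("https://api.crossref.org/works", params=params, timeout=30)
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if items and items[0].get("score", 0.0) >= threshold:
        return items[0]["DOI"]
    return None

ref = {
    "author": "Tkaczyk",
    "article-title": "CERMINE: automatic extraction of structured metadata from scientific literature",
    "journal-title": "International Journal on Document Analysis and Recognition",
    "year": "2015",
}
print(match_structured_reference(ref))
```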

Read all of Dominika Tkaczyk's posts »