This database was created as part of the project Closing the Gap in Non-Latin-Script Data under the auspices of the Berlin University Alliance and the Freie Universität Berlin, which aimed to improve the agency and visibility of non-Latin-script (NLS) research in the Digital Humanities in Germany. One part of the project was to survey the state of the field of “Digitale Arabistik” in Germany, mainly in terms of completed, ongoing, and future research projects, but also in terms of available infrastructure and relevant initiatives in the field.
As we began gathering data on digital projects dealing with Arabic or similar languages, we thought about how to provide this data in a way that adheres to Open Science principles. We therefore chose a public Git repository as our main data store and offer the data as JSON, keeping it as straightforward as possible. Anyone who is interested should be able to contribute without having to deal with a large technology stack.
For those who are not able or willing to work with JSON, we developed a graphical user interface (this website), which offers not only a list of all projects for which we have collected data, but also small visualizations (on which we continue to iterate). As an easier alternative to writing plain JSON, we offer a web form that generates a file conforming to our schema. All of these features will be improved over time, depending on the needs and demands of the community.
Xenia is responsible for supervising the database and workflows of the project. Her current interests include the sustainability of human heritage in the digital era, workflow optimization with AI, exploring the limits of NLP, and as much tech-related stuff as she can still fit into her schedule.
Aibaniz, with a background in Arabic Studies and digital humanities, works on Closing the Gap by identifying Arabic-focused and underrepresented projects, collecting accurate metadata, and developing tools to structure it. She also contributes to research on DH project sustainability and designed the project’s logo and logotype.
Joudy Sido Bozan works at the intersection of digital humanities and programming, developing tools to collect, organize, and visualize research projects in non-Latin scripts. Her work focuses on automating data extraction and creating visual representations, ensuring accessibility, sustainability and structured analysis of research in underrepresented languages.
If you want to contribute in any way, work on a localization of this service, report bugs, or request features, feel free to contact us via GitHub or email. For ongoing updates, detailed discussions, and more information about our project, visit our blog.
Other initiatives and institutions with which we are involved include: