The goal of the AqQua project is to:
- build an AI foundation model of plankton image data,
- roll out the model as an open-source tool for the global community to facilitate plankton-related research,
- leverage this model to develop global plankton and particle distribution models and estimate global plankton- and particle-mediated process rates.
To achieve this, the AqQua project aims to collect a diverse plankton image dataset from a variety of imaging devices (e.g. Underwater Vision Profilers, Zooscan, PlanktoScope, IFCB, FlowCam, …) deployed across different aquatic habitats worldwide. To assemble the most diverse and extensive dataset possible, the AqQua project encourages scientists around the world to share their plankton and particle image data. We are reaching out to you even if your data is already in the public domain. In return, everyone sharing data will be included as an author of a planned data paper. Furthermore, everyone sharing data will be invited to actively contribute to the corresponding foundation model paper, as well as to the global distribution and process rate papers.
The AqQua foundation model will most likely perform better on the kinds of data it has been trained on, so sharing data is likely to benefit data providers' own research in particular. The AqQua project will not analyze any provided dataset in isolation, nor perform local analyses on it.
Due to our full commitment to Open Science, all data shared with the AqQua project must come with permission to be made publicly available under a CC BY-NC 4.0 license as part of our planned data paper, no earlier than July 15, 2027. We are therefore seeking only data whose moratorium ends on July 15, 2027 at the latest. The AqQua project aims to publicly release trained foundation models after November 1, 2025.
To train a foundation model, the AqQua project requires the image data (including scale information), as well as at least the latitude, longitude, depth, date and time of observation. Classification labels (e.g. species or particle type) and trait annotations (e.g. egg-carrying) are very welcome, as these can help fine-tune and benchmark the foundation model, but they are not required. We would also appreciate it if you shared the sample unit definition and the sampled volume for your samples, to enable us to develop global distribution models and to estimate process rates (e.g. as in Laget et al. 2024, Clements et al. 2022 & 2023). Image data from (mono)cultures is also welcome; in this case, the metadata should indicate the original sampling location, date and time.
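As an illustration only, the required and optional metadata described above could be checked before submission with a small script such as the following. This is a minimal sketch, not an AqQua submission format: the record type, the field names, the units and the rule that depth is non-negative are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ImageRecord:
    """Hypothetical per-image metadata record; all field names are illustrative."""
    image_path: str
    pixel_size_um: float                       # scale information (assumed: micrometres per pixel)
    latitude: float                            # decimal degrees, -90 to 90
    longitude: float                           # decimal degrees, -180 to 180
    depth_m: float                             # assumed: metres below the surface
    observed_at: datetime                      # date and time of observation
    label: Optional[str] = None                # optional classification label
    sampled_volume_l: Optional[float] = None   # optional sampled volume (assumed: litres)


def validation_errors(rec: ImageRecord) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passes."""
    errors = []
    if rec.pixel_size_um <= 0:
        errors.append("pixel size must be positive")
    if not -90 <= rec.latitude <= 90:
        errors.append("latitude must be between -90 and 90")
    if not -180 <= rec.longitude <= 180:
        errors.append("longitude must be between -180 and 180")
    if rec.depth_m < 0:
        errors.append("depth must be non-negative")
    if rec.sampled_volume_l is not None and rec.sampled_volume_l <= 0:
        errors.append("sampled volume must be positive when given")
    return errors
```

Optional fields such as labels and sampled volume default to `None`, reflecting that they are welcome but not required.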
AqQua is a moonshot project that relies on comprehensive collaboration with academic labs and non-academic stakeholders across the globe. We sincerely hope that you will join AqQua's mission. It will not only yield maximally powerful AI for the benefit of all plankton-related research, but will also pave the way towards operational global mapping and monitoring of biodiversity, ecosystem health and carbon flux with unprecedented accuracy and granularity, thereby aiding decision making in times of global change. We are very happy to have already received overwhelming signals of support from the global community, with more than 40 academic labs and non-academic stakeholders pledging to share data and contribute expertise. AqQua's mission can only be achieved through global collaboration. We would be stoked to have you on board!
To participate, please carefully read and fill in the online form below. Note that after clicking the "Submit" button at the end of the form, you will be directed to a "print your answers" button, which allows you to download a PDF of your completed form for your records.
If you have any questions or suggestions, please contact us at aqqua@geomar.de.