The availability of Next Generation Sequencing platforms and powerful bioinformatics tools for sequence and data analysis is boosting the number of studies based on high-throughput sequencing of amplicons from food microbial communities. Although most journals require the deposit of sequences in public repositories (such as the NCBI Sequence Read Archive), accessing and using these data requires time and bioinformatics skills.
Among the many tools used to visualize data from sequencing-based studies, network analysis is particularly appealing because it provides a quick way to visualize relationships between food matrices and microbes identified as OTUs (Operational Taxonomic Units). Although network analysis tools are closer to exploratory data analysis than to inferential statistics, they are relatively easy to use and the visualizations they provide are intuitive. Cytoscape is perhaps the most frequently used tool (and provides plugins for the creation and analysis of co-occurrence and correlation networks), but other, more powerful tools are available to expert users. With this in mind, it is tempting to imagine a database and a representation tool that would allow users to carry out meta-analyses of food microbial communities rapidly and efficiently. Such a tool may benefit the scientific community and industrial or public stakeholders in several ways:
- by providing access to a large set of curated data on the occurrence of different taxa in foods to facilitate the process of writing original articles and reviews
- by fostering open access to microbial ecology data
- by improving our understanding of the ecology of spoilage microorganisms and of microorganisms with beneficial use in foods
- by providing large data sets to formulate and validate hypotheses on the structure and dynamics of food microbial communities
- by providing information for food process development
A small consortium of research groups (a list is published at the bottom of this page) has therefore agreed to share data from published and unpublished studies to create a demo for the initiative.
FoodMicrobionet has been created with Gephi (a network visualization tool originally developed for the social sciences, which provides more control over visualization than Cytoscape) and currently includes data from 9 studies on the composition of the bacterial microbiota of dairy products or undefined dairy starter cultures. In its current version (0.6.2) the network includes 879 nodes (262 sample nodes and 617 OTU nodes) with 6015 edges (OTU-sample connections). This makes it by far the largest collection of data in this field.
The figure on the left shows a static representation of the whole network, with the colour of nodes representing food categories or OTUs (different colours are used for different families). Here and in the figures below, the relative importance of each OTU in the dataset is shown by the size of its node, which is related to the weighted degree (i.e. the sum of the weights of all outgoing edges for samples, which by default is 100, or the sum of the weights of all incoming edges for OTUs); the thickness of the edge connecting a given sample with a given OTU represents the % occurrence of that OTU in the sample. A Yifan Hu layout has been applied to the network to highlight similarities among samples and proximities between samples and the OTUs that dominate their microbiota, and to identify core microbial communities in different samples.
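The weighted degree used for node sizing can also be computed directly from an edge list. The following is a minimal sketch, assuming a hypothetical list of (sample, OTU, weight) tuples; the names and abundances are invented for illustration and are not taken from FoodMicrobionet, and the function is not Gephi's implementation.

```python
from collections import defaultdict

# Hypothetical edge list: (sample node, OTU node, weight as % abundance).
# Names and values are invented for illustration only.
EDGES = [
    ("sample_A", "Streptococcus thermophilus", 70.0),
    ("sample_A", "Lactobacillus delbrueckii", 30.0),
    ("sample_B", "Streptococcus thermophilus", 55.0),
    ("sample_B", "Lactococcus lactis", 45.0),
]


def weighted_degrees(edges):
    """Return (weighted out-degree per sample, weighted in-degree per OTU)."""
    out_degree = defaultdict(float)  # sum of outgoing edge weights per sample
    in_degree = defaultdict(float)   # sum of incoming edge weights per OTU
    for sample, otu, weight in edges:
        out_degree[sample] += weight
        in_degree[otu] += weight
    return dict(out_degree), dict(in_degree)


sample_degree, otu_degree = weighted_degrees(EDGES)
```

In this toy example each sample sums to 100, while the OTU present in both samples accumulates the largest weighted in-degree and would therefore be drawn as the largest OTU node.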
Although the figure may look attractive, on its own it offers little insight. The real power of the application lies in the ability to filter and process data rapidly to obtain visualizations at different levels of detail. Gephi itself offers tools for rapid node selection. Two examples of “on the fly” selection are provided below. In the first, all sample nodes in which S. thermophilus appears are selected; in the second, all OTUs occurring in a single sample.
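Outside Gephi, the same two selections amount to looking up the neighbours of a node in the edge list. A minimal sketch, assuming a hypothetical edge list with invented sample names and abundances (the function names are also illustrative):

```python
# Hypothetical edges: (sample, OTU, weight in %). Invented for illustration.
EDGES = [
    ("mozzarella_1", "Streptococcus thermophilus", 80.0),
    ("mozzarella_1", "Lactobacillus helveticus", 20.0),
    ("starter_1", "Lactococcus lactis", 90.0),
    ("starter_1", "Leuconostoc mesenteroides", 10.0),
    ("starter_2", "Streptococcus thermophilus", 100.0),
]


def samples_with_otu(edges, otu):
    """Samples connected to the given OTU, in sorted order."""
    return sorted({s for s, o, _ in edges if o == otu})


def otus_in_sample(edges, sample):
    """OTUs connected to the given sample, in sorted order."""
    return sorted({o for s, o, _ in edges if s == sample})
```

For example, `samples_with_otu(EDGES, "Streptococcus thermophilus")` lists the two samples containing that OTU, and `otus_in_sample(EDGES, "starter_1")` lists the two OTUs found in that sample.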
Simple or complex filters can be used to select subsets of the microbiota in samples belonging to a given food category. In the examples below two networks for Mozzarella and undefined starter cultures are shown. In both cases filters for the dominating OTUs were applied.
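A filter of this kind combines a node attribute (the food category of the sample) with an edge attribute (the weight). The sketch below is one possible way to express it on plain tables; the category labels, sample names, abundances and the 5% cut-off for "dominating" OTUs are all assumptions made for the example, not values used in FoodMicrobionet.

```python
# Hypothetical sample metadata and edges; all names and values are invented.
SAMPLE_CATEGORY = {
    "moz_1": "Mozzarella",
    "moz_2": "Mozzarella",
    "nws_1": "Undefined starter",
}
EDGES = [
    ("moz_1", "Streptococcus thermophilus", 92.0),
    ("moz_1", "Lactobacillus delbrueckii", 7.5),
    ("moz_1", "Acinetobacter sp.", 0.5),
    ("moz_2", "Streptococcus thermophilus", 99.0),
    ("moz_2", "Pseudomonas sp.", 1.0),
    ("nws_1", "Lactobacillus helveticus", 100.0),
]


def filter_network(edges, categories, category, min_weight):
    """Keep edges from samples of one category whose weight >= min_weight."""
    return [
        (sample, otu, weight)
        for sample, otu, weight in edges
        if categories.get(sample) == category and weight >= min_weight
    ]


# Assumed cut-off: an OTU "dominates" when it reaches at least 5% of a sample.
mozzarella_dominant = filter_network(EDGES, SAMPLE_CATEGORY, "Mozzarella", 5.0)
```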
Taxa-specific networks can be easily extracted and visualized. An example with all sample nodes in which Streptococcaceae are found is shown below.
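Extracting a taxon-specific network can be thought of as keeping only the edges whose OTU belongs to the chosen taxon, together with the nodes those edges touch. A minimal sketch, assuming a hypothetical OTU-to-family lookup and invented samples:

```python
# Hypothetical OTU metadata (OTU -> family) and edges, for illustration only.
OTU_FAMILY = {
    "Streptococcus thermophilus": "Streptococcaceae",
    "Lactococcus lactis": "Streptococcaceae",
    "Lactobacillus helveticus": "Lactobacillaceae",
}
EDGES = [
    ("cheese_1", "Streptococcus thermophilus", 60.0),
    ("cheese_1", "Lactobacillus helveticus", 40.0),
    ("cheese_2", "Lactococcus lactis", 100.0),
    ("cheese_3", "Lactobacillus helveticus", 100.0),
]


def family_subnetwork(edges, otu_family, family):
    """Return (sample nodes, OTU nodes, edges) restricted to one family."""
    sub_edges = [e for e in edges if otu_family.get(e[1]) == family]
    samples = sorted({s for s, _, _ in sub_edges})
    otus = sorted({o for _, o, _ in sub_edges})
    return samples, otus, sub_edges


samples, otus, sub_edges = family_subnetwork(EDGES, OTU_FAMILY, "Streptococcaceae")
```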
The structure of the data tables used in FoodMicrobionet is currently evolving. In version 0.6, edge tables include fields for source (food) and target (OTU) nodes, weight (the frequency of the OTU in the sample, as %) and a number of fields for filtering purposes. Node tables include metadata for each OTU and food sample (including labels, taxonomic lineages and links out to other resources). These fields can be used to select sample and OTU nodes based on different properties and to apply partition and ranking styles to nodes. Specifications for nodes and edges can be found here.
In the future, more fields will be added with information on ecologically relevant properties of foods (water activity, pH, temperature of storage or of production, main ingredients).
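To make the table structure concrete, the sketch below models a node record and an edge record and checks that the two tables are consistent. Only source, target, weight, label, lineage and an outgoing link are named in the description above; the exact field names, the `node_type` field, the example lineage string and the `check_edges` helper are assumptions for illustration, not the published specification.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    label: str
    node_type: str      # "sample" or "OTU" (assumed encoding)
    lineage: str = ""   # taxonomic lineage, used for OTU nodes
    link: str = ""      # link out to an external resource


@dataclass
class Edge:
    source: str         # id of the food sample node
    target: str         # id of the OTU node
    weight: float       # frequency of the OTU in the sample, as %


def check_edges(nodes, edges):
    """Return a list of problems; an empty list means the tables agree."""
    types = {n.node_id: n.node_type for n in nodes}
    problems = []
    for e in edges:
        if types.get(e.source) != "sample":
            problems.append(f"{e.source}: source is not a sample node")
        if types.get(e.target) != "OTU":
            problems.append(f"{e.target}: target is not an OTU node")
        if not 0 < e.weight <= 100:
            problems.append(f"{e.source}->{e.target}: weight outside (0, 100]")
    return problems


# Invented example records.
NODES = [
    Node("s1", "Hypothetical cheese sample", "sample"),
    Node("o1", "Streptococcus thermophilus", "OTU",
         lineage="Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae"),
]
EDGES = [Edge("s1", "o1", 85.0)]
```

Calling `check_edges(NODES, EDGES)` on these records returns an empty list; an edge that points from an OTU to a sample, or that carries a weight above 100, would be reported.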
Furthermore, interactive visualization of the network, or of subnetworks extracted by filtering, can be obtained by using the Sigmajs exporter plugin of Gephi, or similar tools. An example of an interactive graph can be found here.
Edge tables can also be used for post-processing in other statistical software, and both edge and node tables can be easily imported into Cytoscape.
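As an illustration of such post-processing, an edge table saved as CSV can be pivoted back into a sample-by-OTU abundance matrix, which is the shape most statistical packages expect. This is a minimal sketch using only the Python standard library; the column names `Source`, `Target` and `Weight` and the inline data are assumptions for the example.

```python
import csv
import io

# Hypothetical edge table in CSV form; column names and values are assumed.
EDGE_CSV = """Source,Target,Weight
s1,OTU_a,60
s1,OTU_b,40
s2,OTU_a,100
"""


def edges_to_matrix(csv_text):
    """Pivot an edge table into (samples, otus, matrix); absences become 0."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    samples = sorted({r["Source"] for r in rows})
    otus = sorted({r["Target"] for r in rows})
    weights = {(r["Source"], r["Target"]): float(r["Weight"]) for r in rows}
    matrix = [[weights.get((s, o), 0.0) for o in otus] for s in samples]
    return samples, otus, matrix


samples, otus, matrix = edges_to_matrix(EDGE_CSV)
```

Each row of `matrix` corresponds to one sample and each column to one OTU, in the order given by `samples` and `otus`.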
The future plans of our group include:
- establishing a consortium agreement
- advertising the initiative by publishing a paper in a leading food microbiology or microbial ecology journal (updates of FoodMicrobionet may be published at defined intervals)
- recruiting more partners both from the scientific community and from the industry
- finding sponsorships from publishers, scientific societies, stakeholders
- participating in calls for a COST project
- participating in calls for funding within the framework of Horizon 2020
Are you interested? Please feel free to contact us by e-mail.
To submit data to FoodMicrobionet please follow these steps:
- contact us by E-mail
- fill in and return a signed Expression of Interest form on your institution's official letterhead. Send a scanned version (.pdf, .jpg) by e-mail.
- prepare your data. At this stage FoodMicrobionet accepts only data from published studies. Three files are required:
  - a Microsoft Excel file with metadata on the project (an example is provided here)
  - a Microsoft Excel file with sample metadata (an example is provided here)
  - OTU abundance data for each sample in the form of either a rectangular matrix (this is the preferred format; an example is provided here) or edge and node tables generated using the QIIME script make_otu_network.py
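The relationship between the two accepted formats can be sketched as follows: each non-zero cell of a rectangular abundance matrix becomes one edge whose weight is the OTU's share of the sample, in %. The orientation (samples as rows, OTUs as columns), the use of raw counts and all names below are assumptions for illustration; this is not the QIIME script and does not reproduce its output format.

```python
def matrix_to_edges(samples, otus, counts):
    """Convert a samples x OTUs count matrix into (sample, OTU, %) edges."""
    edges = []
    for sample, row in zip(samples, counts):
        total = sum(row)
        if total == 0:
            continue  # a sample without reads contributes no edges
        for otu, count in zip(otus, row):
            if count > 0:
                edges.append((sample, otu, 100.0 * count / total))
    return edges


# Invented example: two samples, three OTUs, raw read counts.
SAMPLES = ["s1", "s2"]
OTUS = ["OTU_a", "OTU_b", "OTU_c"]
COUNTS = [[30, 10, 0], [0, 0, 50]]
edges = matrix_to_edges(SAMPLES, OTUS, COUNTS)
```

OTUs that are absent from a sample produce no edge, so the edge table is usually much shorter than the full matrix.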
List of participating groups:
- Main contributors
  - Prof. Eugenio Parente, Laboratory of Industrial Microbiology, Università degli Studi della Basilicata, Italy.
  - Prof. Danilo Ercolini’s group, Università degli Studi di Napoli “Federico II”, Italy.
  - Prof. Luca Cocolin’s group, Università degli Studi di Torino, Italy.
  - Dr. Paul Cotter’s group, Teagasc Food Research Centre, Moorepark, Ireland.
  - Prof. Marco Gobbetti’s group, Università degli Studi di Bari “Aldo Moro”, Italy.
- Other contributors
  - Prof. Erasmo Neviani’s group, Università degli Studi di Parma, Italy.