Great white sharks enjoy what can only be described as a complex relationship with humankind. On one hand, they have been hunted into official “vulnerable” status, mainly because of demand for their valuable fins or as a result of regional culls. On the other, they are subject to a growing number of protection and conservation schemes that seek to boost population numbers and ban often cruel hunting practices.
Sharks are at risk not only because of culls driven by fears that they will attack people at sea, but also because they can get caught in large-scale fishing equipment or be harmed by climate-driven changes to their habitats. Their fins also support a number of lucrative lines of business.
Shark fins, in general, have long been prized as a key ingredient in shark fin soup, a Chinese dish served at celebrations and banquets. It can be traced as far back as the Emperor Taizu of the Northern Song, who reigned from 960 to 976. Legend states that he introduced shark fin soup to showcase his wealth, power and generosity.
Popularity of the soup has declined globally in recent times. This is partly a response to conservation efforts, but it also follows reports that shark fins can contain mercury and methylmercury at concentrations considered unsafe for human consumption.
However, claims about the low incidence of cancer in sharks, together with their remarkable ability to recover from injuries, have driven a new market for shark cartilage-based medicines, pills and powders. Although multiple clinical trials have disproved the theory that ingesting shark cartilage can cure cancer, demand for such products persists.
As the world’s largest known predatory fish and one of the oldest established species on the planet, great white sharks play a vital role in balancing the global food chain. However, there is a relative shortage of information on record (compared to land-based animals) about how great whites live their lives.
Their geographic range is also up for debate and is changing as a result of preservation efforts for their food sources, such as seals. Much has been written about the return of great white sharks to Cape Cod, as well as their disappearance from South African shores.
Uncovering information about the great white shark is an important element of preserving its global population—and the vital efforts to identify, classify and understand available data are increasingly driven and supported by modern technology.
Tracking Great Whites off the California Coast
The world famous Monterey Bay Aquarium, for example, has worked for the past 15 years to track individual white sharks in the eastern Pacific area as part of its Conservation Research and Education Program.
The aquarium and research partners, including Stanford and California State Long Beach Universities, use electronic tracking tags and genetic analyses of tissue and blood samples collected from adult and juvenile sharks to identify and monitor the great white shark population. This work is making significant contributions to knowledge about where white sharks in the eastern Pacific Ocean travel, where they live and how their basic physiology works.
“Of the 501 or so species of sharks, 79 are imperiled, according to the International Union for the Conservation of Nature,” the aquarium says. “White sharks are top predators in the sea, but they’re in great danger of being depleted.”
Programs coordinated by the aquarium include juvenile white shark research. This involves fitting juveniles with an externally attached, pop-up satellite tag with a tiny computer that collects and stores data on temperature, depth and light, which are used to estimate the juvenile’s position.
On a pre-programmed date, the tag pops off and floats to the surface. At the surface, it transmits data to the aquarium via satellite. If the tag is recovered, even more data are retrieved. To date, scientists have tagged and tracked 18 juveniles and 167 adults.
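One way to see how logged light levels can yield a position is the general idea behind light-level geolocation: the time of local solar noon indicates longitude, and the length of the day indicates latitude. The Python sketch below is a simplified illustration of that idea only, not the aquarium's actual processing. The function names, the declination approximation and the sample sunrise and sunset times are all assumptions made for illustration.

```python
import math


def solar_declination_deg(day_of_year):
    """Rough approximation of the sun's declination for a given day of the year."""
    return -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))


def estimate_position(sunrise_utc_h, sunset_utc_h, day_of_year):
    """Estimate (latitude, longitude) in degrees from sunrise and sunset times.

    Times are hours since midnight UTC on the day of sunrise, so a sunset
    that falls after midnight UTC is given as a value above 24.
    Latitude is returned as None near the equinoxes, where day length is
    almost the same everywhere and carries little latitude information.
    """
    # Longitude: the sun moves 15 degrees per hour, so the offset of local
    # noon from 12:00 UTC gives degrees east (+) or west (-).
    noon_utc = (sunrise_utc_h + sunset_utc_h) / 2.0
    longitude = (12.0 - noon_utc) * 15.0
    longitude = ((longitude + 180.0) % 360.0) - 180.0  # wrap to [-180, 180)

    # Latitude: sunrise equation, cos(H) = -tan(lat) * tan(declination),
    # where H is half the day length expressed as an angle.
    half_day_angle = math.radians((sunset_utc_h - sunrise_utc_h) / 2.0 * 15.0)
    declination = math.radians(solar_declination_deg(day_of_year))
    if abs(declination) < math.radians(1.0):
        return None, longitude
    latitude = math.degrees(math.atan(-math.cos(half_day_angle) / math.tan(declination)))
    return latitude, longitude


# Invented example: a 14-hour day in late June with local noon at 20:00 UTC.
lat, lon = estimate_position(13.0, 27.0, 172)
print(round(lat, 1), round(lon, 1))
```

Real tag data are far noisier than this, since a diving shark sees changing light at depth. That is one reason temperature and depth readings are recorded alongside light.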
The aquarium also promotes policies in the U.S. and internationally to end practices that threaten all shark species: the targeting of sharks for their lucrative fins, and the use of indiscriminate fishing practices that catch and kill sharks in gear intended for other species.
More recently, it has introduced Zegami data visualization software to bring structure to its large collection of visual data, which includes pictures of great white sharks spotted off the coasts of California, right down to Baja, Mexico.
So far, the research team has identified and researched 475 individual white sharks. This has been done by a team of researchers from the aquarium and Stanford University with help from members of the public, including fishermen and divers, who submit photographs when they spot sharks in the wild.
During the height of the white shark season off Central California each autumn, researchers visit elephant seal rookeries at the Farallones, Año Nuevo Island and Point Reyes. The rookeries attract large groups of white sharks, which come to the surface as the team launches seal decoys. As the sharks emerge, researchers photograph their dorsal fins.
As a result of this activity, the aquarium has built a catalog of over 2,000 images of great white sharks’ fins that help its research team track the movements of sharks. Sharks are easily distinguished by their fins, with each shark’s dorsal fin having a series of notches and ridges as individual as a human fingerprint.
Scientists believe these notches and ridges are caused by cuts and scrapes, and by the bites of parasites called copepods when the sharks are young and their skins are soft. Over time the bites heal over and leave unique patterns by which each shark can be individually identified.
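The fingerprint comparison can be made concrete with a small sketch. Assume, purely for illustration, that each fin's trailing edge has already been reduced to a list of notch depths measured from base to tip. A new photo can then be ranked against the catalog by how closely its profile matches each stored one. The representation, the function names and the sample values below are hypothetical. This is not a description of how the aquarium or Zegami matches fins.

```python
import math


def resample(profile, n=32):
    """Linearly resample a list of notch depths to n evenly spaced points."""
    if len(profile) == 1:
        return [profile[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(profile) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(profile) - 1)
        frac = pos - lo
        out.append(profile[lo] * (1 - frac) + profile[hi] * frac)
    return out


def profile_distance(a, b, n=32):
    """Root-mean-square difference between two resampled notch profiles."""
    ra, rb = resample(a, n), resample(b, n)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ra, rb)) / n)


def rank_matches(query, catalog, top_k=3):
    """Return up to top_k (fin_id, distance) pairs, closest match first."""
    scored = sorted((profile_distance(query, prof), fin_id)
                    for fin_id, prof in catalog.items())
    return [(fin_id, dist) for dist, fin_id in scored[:top_k]]


# Invented profiles: one fin with a single deep central notch, one with two notches.
catalog = {
    "fin-A": [0.0, 0.0, 5.0, 0.0, 0.0],
    "fin-B": [0.0, 3.0, 0.0, 3.0, 0.0],
}
new_photo = [0.0, 0.2, 4.8, 0.1, 0.0]
print(rank_matches(new_photo, catalog))
```

A ranking like this only narrows the candidates. Confirming a match still falls to a researcher comparing the photographs.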
So familiar has the aquarium’s team become with local great white sharks, which return to the California area on an annual basis, that they have been able to give individual names to many of them.
While some have descriptive names like “Middle-notch” or “Split-fin,” one is named “Tom Johnson” after the naturalist who took the first photo of this shark in 1987. Another, “Leno,” has a fin resembling the profile of the famous U.S. late-night TV host Jay Leno.
Dr. Salvador Jorgensen, senior research scientist at Monterey Bay Aquarium, observed: “White sharks are probably the best known apex marine predator on the planet, so it’s no surprise there’s a huge amount of interest around them—and the amount of photos we’re sent is testament to that. We have some sharks on record that have been tracked over 25 years. However, it was becoming increasingly challenging to manage our growing collection.”
Visualizing Unstructured Data
The aquarium chose Zegami data visualization technology because it provides a simpler way to navigate and catalog its collection and quickly identify an individual shark in the images. The software allows the team to display all of its pictures on a single screen, making it much quicker for them to scan through for a match to a new photograph, for example.
The software was first developed in response to the need for easier ways to search through unstructured data like images and sound, which now make up as much as 85 to 90 percent of the information stored by organizations.
Originally inspired by Microsoft PivotViewer software, Zegami was founded in 2016 after a successful collaboration with the Weatherall Institute of Molecular Medicine at the University of Oxford.
The many challenges of this research program meant that Zegami was developed so that it could be used to visualize any kind of data, including images, video, documents or more traditional data sources like databases and spreadsheets.
Examples of projects using Zegami have included applications as diverse as the Pittsburgh Pirates analyzing data on thousands of players from the college system and around the world to find the best talent available, and the Australian Plant Phenomics Facility monitoring large-scale investigations into plant growth in variable conditions.
Most data visualization tools are built with structured data in mind, due largely to the fact that working with unstructured data requires a wholly different understanding and approach.
Unstructured data might comprise images, video, emails, documents and social media posts. Working with such data can be difficult due to its very nature: file sizes are frequently large, so they are difficult to collect, store and move around.
Processing this kind of data requires a lot of computational power and memory, frequently exceeding the capabilities of a single machine. It is only relatively recently, with the advance of cloud computing, that the capability to manage data of this kind effectively has become more accessible.
How Does the Technology Work?
Metadata (data that describes, identifies and signposts data) is an important element of storing unstructured data like the aquarium’s photographs. On average, we can spend up to 2.5 hr. a day just searching for information among unstructured, uncategorized data to help us do our jobs.
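To illustrate what metadata adds, the sketch below attaches a small record to each photograph and then filters and groups on those fields. It is a minimal Python example under stated assumptions: the field names, shark identifiers, file names and dates are invented, and none of it reflects Zegami's actual data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PhotoRecord:
    """Hypothetical metadata describing one fin photograph."""
    filename: str
    shark_id: str
    site: str
    taken_on: date


def filter_by_site(records, site):
    """Keep only the photos taken at the given site."""
    return [r for r in records if r.site == site]


def group_by_shark(records):
    """Map each shark_id to its photos, ordered from oldest to newest."""
    groups = defaultdict(list)
    for record in sorted(records, key=lambda r: r.taken_on):
        groups[record.shark_id].append(record)
    return dict(groups)


def sighting_span_years(records):
    """Number of calendar years between the first and last photo."""
    first = min(r.taken_on for r in records)
    last = max(r.taken_on for r in records)
    return last.year - first.year


# Invented sample catalog (all values are placeholders).
SAMPLE = [
    PhotoRecord("img_0417.jpg", "shark-A", "Point Reyes", date(2012, 11, 9)),
    PhotoRecord("img_0001.jpg", "shark-A", "Farallones", date(1990, 10, 3)),
    PhotoRecord("img_0230.jpg", "shark-B", "Point Reyes", date(2009, 9, 21)),
]

by_shark = group_by_shark(SAMPLE)
print({shark: [r.filename for r in photos] for shark, photos in by_shark.items()})
print(sighting_span_years(by_shark["shark-A"]))
```

Without fields like these, answering a question such as "which sharks were photographed at this site?" means opening every image. With them, it becomes a single filter.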
Unlike structured data, large, disparate sources of unstructured data must be processed and analyzed before they can be qualified, understood and then effectively used.
Humans are extremely good at understanding the messy, unstructured nature of the world, and making sense of unstructured data has normally been the preserve of people examining records outside of any database.
But applying human intelligence directly to the scale of data being amassed in the information age is simply not feasible. The processing performance of the human brain has its own limitations and cannot scale at the same rate that unstructured data are growing.
This is where machine learning and artificial intelligence (AI) enter the picture. These techniques have the ability to consume large amounts of data streamed from varied sources and to make sense of it all in a systematic way. Through the large-scale processing of hundreds of thousands or even millions of pieces of data and content in a relatively short period of time, AI can extract quantifiable information in a manageable format. And by delegating tasks such as face detection, object detection and metadata extraction to AI, humans can step away from the repetitiveness of manually annotating data to make better use of their cognitive functions.
This is important because, in many cases, the evaluation of content is highly subjective and not suited to processing even by advanced AI techniques. Such crucial, subjective judgments might relate to financial, quality, research or progress decisions over large visual data sets. Key decisions can be made with more speed and more clarity when salient information has been extracted automatically from thousands or even millions of files.
The aquarium uses its extensive collection of photographs to determine whether the population of great white sharks is stable or has changed in the last five to 10 years in line with conservation programs. One of the many challenges for great white sharks is that they take a long time to reach adult maturity, which means any recovery from decline will be slow to manifest itself.
The research will also help to identify areas that are important for feeding or as nurseries. “This is more important than ever at a time when the health of our oceans and the safety of our sharks are both under critical threat,” said Jorgensen.
From Zegami’s point of view, we’re delighted to be able to help Monterey Bay Aquarium with this project: The team’s passion for ocean health and the preservation of white sharks is hugely inspiring. We’ve been able to work across a large collection of images to make a powerful tool for them, which we hope will contribute to the conservation of these threatened sea creatures.
Our ultimate aim with Zegami is to build technology that enables organizations to recognize patterns quickly, understand information and find hidden insights by allowing users to search, sort, filter, group and analyze large collections of images and data simply and intuitively. We want to make data visualization into a technique that is available to everyone, not just data scientists. The Monterey Bay Aquarium image library is a great example of how that can work in practice.