Generative AI Overcomes Data Scarcity and Drives Innovation

Satish Kumar Barnala

Enterprise Architect, Business Transformation Group

Highlights

Developing effective AI/ML models requires large high-quality data sets that are often classified as ‘sensitive’, difficult to collect, and expensive to utilize.
Generative AI models such as generative adversarial networks (GANs) can address these challenges by generating synthetic data.
Generative AI helps organizations navigate data regulations, imbalanced datasets, the risk of data breaches, and the cost of collecting data.

Collecting real data

Effective AI/ML models require large volumes of high-quality data, but collecting accurate and usable data is not always easy.

Data that includes personally identifiable information (PII) or protected health information (PHI) is vital for AI/ML models that solve complex business problems. With strict data privacy regulations such as GDPR and the risk of data breaches, enterprises find it difficult to depend solely on real data. The time taken to collect real data and the cost of procuring it also pose a challenge. Enterprises need better ways to collate and leverage data that help them achieve their business goals.

Take application or product testing, for instance. It requires a huge amount of real-world data that may be difficult and time-consuming to procure. In finance, there is a lot of talk about how AI can help fight fraud, but getting sufficient data to train ML models to predict fraud (or anomalies) is challenging because fraudulent transactions are rare. Lastly, datasets that are made available for analysis might be imbalanced because some classes are underrepresented. All these reasons reinforce the need to generate synthetic data that resembles real-world data and can be used to build and train ML models accurately.
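
To make the class-imbalance problem concrete, here is a minimal sketch of rebalancing a rare class with synthetic rows. It uses simple interpolation between pairs of real minority rows (in the spirit of oversampling methods such as SMOTE), not a GAN. The toy data, the feature meanings, and the function name `oversample_minority` are assumptions made for illustration.

```python
import random

def oversample_minority(rows, n_new, rng):
    """Return n_new synthetic rows, each interpolated between two random real rows."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.choice(rows), rng.choice(rows)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic

rng = random.Random(0)
# Hypothetical toy data: (amount, transactions per hour) for each record.
legit = [(rng.gauss(50, 20), rng.gauss(1, 0.5)) for _ in range(1000)]
fraud = [(rng.gauss(400, 50), rng.gauss(4, 1)) for _ in range(20)]

# Add enough synthetic fraud rows to match the size of the majority class.
synthetic_fraud = oversample_minority(fraud, len(legit) - len(fraud), rng)
fraud_balanced = fraud + synthetic_fraud
print(len(legit), len(fraud_balanced))  # prints "1000 1000"
```

Because every synthetic row is a blend of two real fraud rows, it stays within the range of the observed fraud data. That is also the limitation: interpolation cannot invent fraud patterns that the real sample never contained.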

Generative AI

Create synthetic data that reflects important statistical properties of underlying real-world data.

With synthetic data, enterprises can address uncertainties around the availability of real-world data. Recent developments in generative AI models and algorithms can potentially ensure that synthesized data represents real data accurately.
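
As a rough illustration of what "reflecting statistical properties" can mean, the sketch below fits the means, standard deviations, and correlation of two columns and then samples fresh records from a bivariate normal distribution with those parameters. This is a simple parametric stand-in for a generative model, and it assumes the data is roughly Gaussian. The columns (age and annual spend) and all function names are hypothetical.

```python
import math
import random
import statistics

def fit_gaussian(pairs):
    """Estimate means, standard deviations, and correlation of two columns."""
    xs, ys = zip(*pairs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs) / (len(pairs) - 1)
    return mx, my, sx, sy, cov / (sx * sy)

def sample_synthetic(params, n, rng):
    """Draw n synthetic pairs from a bivariate normal with the fitted parameters."""
    mx, my, sx, sy, rho = params
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        out.append((mx + sx * z1,
                    my + sy * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)))
    return out

rng = random.Random(42)
# Hypothetical "real" data: age and a correlated annual spend.
real = []
for _ in range(5000):
    age = rng.gauss(40, 10)
    real.append((age, 200 + 30 * age + rng.gauss(0, 150)))

real_params = fit_gaussian(real)
synthetic = sample_synthetic(real_params, 5000, rng)
synth_params = fit_gaussian(synthetic)  # should be close to real_params
```

No synthetic record corresponds to a real individual, yet the fitted parameters of the two datasets should be close. Real enterprise data is rarely this well behaved, which is where more expressive generative models come in.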

Generative AI models such as generative adversarial networks (GANs) are adept at discovering structures and patterns in a dataset. These patterns can be used to create synthetic data that overcomes data shortages during AI/ML implementations. GANs are a powerful class of neural networks used for unsupervised learning. They consist of two competing neural network models, a generator and a discriminator: the generator learns to capture and reproduce the variations within a dataset, while the discriminator learns to tell generated samples from real ones.
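
The competition between the two networks is commonly written as a minimax game. The formulation below is the standard one from the GAN literature rather than something specific to this article. Here G is the generator, D is the discriminator, p_data is the distribution of real data, and p_z is the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to maximize V by scoring real samples high and generated samples low, while the generator tries to minimize it by producing samples the discriminator cannot tell apart from real ones.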

Use cases

Data is wealth: high-quality synthetic data that eliminates the privacy constraints of real-world data is invaluable for any industry.

From identity protection and anonymity in sensitive situations to helping remove biases during recruitment, generative AI has a lot of potential across industries.

Media and entertainment: In dubbed foreign-language films, generative AI can improve the viewer’s experience by accurately syncing lip movements with the dialogue. It can also restore old images and movies by converting low-resolution visuals into highly detailed imagery.

Healthcare: Generative AI models can improve the functioning of prosthetic limbs by observing the wearer’s movements. They can also produce high-quality speech for people with speech impairments and help detect potential diseases early through applications such as generating different angles of an X-ray image.

Banking and financial services: Financial businesses and banks can use synthetic data to develop technology solutions and to build a robust foundation for better data-driven predictions. This allows businesses to improve their price forecasting accuracy and optimize their portfolios.

Manufacturing: Generative design is a key area in manufacturing where engineers can input their design objectives such as material details, manufacturing processes, and cost constraints. The program analyzes all the input parameters and generates design options that cover a wide range of alternatives.

Challenges

While synthetic data is less expensive than collecting real data, there are questions that enterprises need to address before adopting generative AI.

It can be challenging to get an enterprise’s stakeholders and business owners to agree on the acceptance criteria for using synthetic data. Synthetic data may also reflect biases present in the source data, so further exploratory data analysis is needed to identify and reduce those biases before generating it. Enterprise data maturity assessment models should be used to identify gaps in existing data and analytics programs and to develop specific approaches for plugging them with synthetic data.
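
One simple form this exploratory analysis can take is comparing outcome rates across groups in the source data before any synthetic data is generated from it. The sketch below is an assumption-laden illustration: the recruitment-style records, the group labels, and the function `selection_rate_by_group` are all hypothetical.

```python
from collections import defaultdict

def selection_rate_by_group(records):
    """Return the share of positive outcomes for each group in (group, outcome) records."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        positives[group] += outcome
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical source data: (group label, 1 if shortlisted else 0).
source = [("A", 1)] * 60 + [("A", 0)] * 40 + [("B", 1)] * 30 + [("B", 0)] * 70
rates = selection_rate_by_group(source)

# A large gap suggests a skew that a generative model trained on this data would reproduce.
gap = max(rates.values()) - min(rates.values())
print(rates, round(gap, 2))
```

A gap like this does not prove unfairness on its own, but it flags a pattern that should be reviewed and, where appropriate, corrected before the data is used to train a generator.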

Synthesized data may also fail to reproduce the outliers found in real-world data. However, depending on how important outliers are for a given business application, data scientists can treat them separately, for example by using generative AI to produce synthesized outlier data that represents the actual data realistically.
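
The idea of treating outliers separately can be sketched as follows. The example splits the real data with a three-standard-deviation rule, models the bulk with a plain Gaussian, and generates outliers on their own by resampling real outliers with a small jitter. Both the splitting rule and the jittered resampling are illustrative stand-ins. In practice a generative model could be trained on the outlier subset instead. The data and names are hypothetical.

```python
import random
import statistics

def split_outliers(values, k=3.0):
    """Split values into (bulk, outliers) using a k-standard-deviation rule."""
    mu, sigma = statistics.mean(values), statistics.stdev(values)
    bulk = [v for v in values if abs(v - mu) <= k * sigma]
    outliers = [v for v in values if abs(v - mu) > k * sigma]
    return bulk, outliers

rng = random.Random(7)
# Hypothetical real data: mostly routine amounts plus a few extreme ones.
real = [rng.gauss(100, 15) for _ in range(2000)]
real += [rng.uniform(400, 600) for _ in range(20)]

bulk, outliers = split_outliers(real)

# Model the bulk with a plain Gaussian, which on its own would almost never emit extremes.
mu, sigma = statistics.mean(bulk), statistics.stdev(bulk)
synthetic_bulk = [rng.gauss(mu, sigma) for _ in range(len(bulk))]

# Generate outliers separately: resample real outliers with small multiplicative jitter.
synthetic_outliers = [rng.choice(outliers) * rng.uniform(0.95, 1.05)
                      for _ in range(len(outliers))]
synthetic = synthetic_bulk + synthetic_outliers
```

Keeping the two parts separate lets the team decide explicitly how many outliers the synthetic dataset should contain, instead of hoping a single model reproduces them.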

"Gartner estimates that by 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated."

About the author

Satish Kumar Barnala

Satish Kumar Barnala is an enterprise architect and leads transformation and innovation initiatives for the Business Transformation Group (BTG) at TCS.
