THE USE OF SYNTHETIC DATA IN QUANTITATIVE AND QUALITATIVE RESEARCH


Öçal S., Abdurahmanova Y., Arıcıgil Çilan Ç.

IRSYSC 2024 VIII. INTERNATIONAL RESEARCHERS, STATISTICIANS, AND YOUNG STATISTICIANS CONGRESS NOVEMBER 28-30, 2024, Adana, Türkiye, November 28-30, 2024, p. 127 (Abstract)

  • Publication Type: Conference Paper / Abstract
  • City of Publication: Adana
  • Country of Publication: Türkiye
  • Pages: p. 127
  • Istanbul University Affiliated: Yes

Abstract

Synthetic data is data generated by artificial intelligence to reduce research costs and save time when real data is not available. It includes not only entered numerical data and codes but also text, images, audio, and video recordings. A review of the studies in the literature that use synthetic data shows that it is mainly applied in the fields of tourism, health, finance, automotive, and market research. Synthetic data is produced with generative artificial intelligence, generally using machine learning techniques such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Reinforcement Learning from Human Feedback (RLHF), and Agent-Based Modeling (ABM). In quantitative and qualitative research, the use of synthetic data has become widespread in cases where sufficient data cannot be collected and the confidentiality of the study is a priority. In qualitative research, where finding participants is difficult and face-to-face interviews are costly, synthetic data can simulate real-world scenarios, interviews, and observations with the help of artificial intelligence. Likewise, large language models can be used to create a synthetic data set that mimics real data. In quantitative research, synthetic data is used to impute missing observations with artificial intelligence systems. This study reviews the literature on synthetic data used in quantitative and qualitative research.

Keywords: Synthetic Data, Quantitative and Qualitative Research, Generative AI