How to Create Dummy Data in Python

Date: 2021-09-13

Dummy data is randomly generated data that can be substituted for live data. Whether you are a developer, software engineer, or data scientist, you sometimes need dummy data to test what you have built, whether that is a web app, a mobile app, or a machine learning model.

If you work in Python, you can use the Faker package to create dummy data of many types, for example dates, transactions, names, text, and times. Faker is a simple Python package that generates fake data of different types.

The Faker package is heavily inspired by PHP Faker, Perl Faker, and Ruby Faker.

In this article, you will learn different ways to create dummy data with the Faker Python package.

Table of Contents:

Faker Installation
Create Faker Generator
Create Names
Create Dates and Times
Create Personal Profile
Create Sentence & Paragraph Data
Create Localized Data
Create the same Fake Data
Reference

How to Install Faker to Create Dummy Data

You can install the package with pip as follows:

pip install Faker

Note: From version 4.0.0, Faker dropped support for Python 2 and from version 5.0.0 it only supports Python 3.6 and above.

Create Faker Generator for Dummy Data

To create and initialize a Faker generator, instantiate the Faker class.

from faker import Faker

fake = Faker()

Now you can start creating whatever dummy data you need.

Create Names

You can use the name() method to create full fake names.

for _ in range(10):
    print(fake.name())

Mathew Brown

Mrs. Julie Chavez

Calvin Little

Manuel Ponce

Alyssa Jackson DVM

Amy Delgado

Matthew Smith

Sarah Rojas

Crystal Werner

Tina Moore

Note: You can also use the first_name() method to create the first name and the last_name() method to create the last name.

Create Dates and Times

If you are working with dates, Faker provides several ways to create fake dates and times. The following example shows 10 of them.

# date between 3 years ago and 1 year ago (2018 to 2020 at the time of writing)
print(fake.date_between(start_date="-3y", end_date="-1y"))
print(fake.month())
print(fake.date_time())
print(fake.year())
print(fake.month_name())
print(fake.date_time_this_year())
print(fake.time())
print(fake.timezone())
print(fake.day_of_week())
print(fake.time_object())

2019-05-31

02

2012-05-31 17:53:01

2002

November

2021-06-30 00:34:48

08:17:51

Africa/Gaborone

Thursday

17:59:37

Create a Personal Profile

If you want to create fake personal and identity information, you can use the profile and simple_profile methods from the Faker library.

The simple_profile method creates a fake basic profile with personal information such as name, gender, mail, and address.

generateProfile = Faker()
generateProfile.simple_profile()

{'username': 'qfowler',
 'name': 'Matthew Greene',
 'sex': 'M',
 'address': 'USNV Lopez\nFPO AA 45803',
 'mail': '[email protected]',
 'birthdate': datetime.date(1995, 8, 14)}

The profile method creates a fuller fake personal profile, adding fields such as job, company, residence, blood_group, current_location, and others.

generateProfile.profile()

{'job': 'Designer, television/film set',
 'company': 'Murillo, Short and Townsend',
 'ssn': '893-14-6729',
 'residence': '6596 Daniel Spring Suite 910\nJonesborough, ID 59049',
 'current_location': (Decimal('4.2622025'), Decimal('-39.109752')),
 'blood_group': 'O-',
 'website': ['https://hardin-johnson.org/',
  'https://patterson.com/',
  'https://george-snyder.info/'],
 'username': 'samuelbooth',
 'name': 'Shawna Spencer',
 'sex': 'F',
 'address': '125 Darrell Extension Suite 575\nPort Michaelbury, PA 12381',
 'mail': '[email protected]',
 'birthdate': datetime.date(1989, 11, 25)}

You can also create more than one profile and save the profile data in a pandas DataFrame for analysis. In the following example, we create 1000 profiles with just a few lines of code.

import pandas as pd

generateProfile = Faker()

# generate 1000 profiles
data = [generateProfile.profile() for i in range(1000)]

# save profiles in pandas dataframe
df = pd.DataFrame(data)
print(df)

Let’s observe the column names of the 1000 profiles created.

print(df.columns)

Index(['job', 'company', 'ssn', 'residence', 'current_location', 'blood_group',
       'website', 'username', 'name', 'sex', 'address', 'mail', 'birthdate'],
      dtype='object')

We have 13 columns in the dataset. Now you can use the dummy data you generate for data analysis and visualization.

Create Sentence & Paragraph Data

If you are working on a software project, you can use the Faker library to generate fake text data to test some features in your web or mobile app. The Faker library provides 4 different methods to create text data as follows.

(a) Create a Single Paragraph

generateText = Faker()
generateText.text()

'Goal everything traditional to. Suggest stage stop international. Hold line south across new charge national.\nClose money commercial success force. Five decision even environment notice every.'

(b) Create Multiple Paragraphs

generateTexts = Faker()
generateTexts.texts()

['Together require growth wind picture raise. Production task tree consumer recognize personal.',
 'Be six whose answer. Mr oil successful under particular option.\nStep nor once rise. Eye thank try stay only test service. Then senior within capital action. Gun already entire sign garden.',
 'Painting now term direction. Will inside natural bar purpose major.\nOther hear subject do their. Institution between education would laugh example on. Real statement kid specific able foreign.']

(c) Create a Single Sentence

generateSentence = Faker()
generateSentence.sentence()

'Pass front responsibility.'

(d) Create Multiple Sentences

generateSentences = Faker()
generateSentences.sentences()

['Maintain take star someone could kitchen employee.',

'Pay should own word begin.',

'Citizen place although old despite stay.']

Create Localized Data

The Faker library supports the creation of localized data. To use it, pass the locale as an argument to the Faker constructor; the default locale is en_US.

A list of localized providers is available in the Faker documentation.

In the following example, we create 10 Chinese names using the zh_CN locale.

fake_local = Faker('zh_CN')

for _ in range(10):
    print(fake_local.name())

李小红

赵桂香

陈小红

罗建华

宋华

刘秀芳

郭秀华

朱秀云

金艳

侯琴

Since version 3.0.0, you can also pass multiple locales.

multiple_fake = Faker(['uk_UA', 'en_US', 'ja_JP'])

for _ in range(10):
    print(multiple_fake.city())

長生郡長生村

Christieland

Rileyshire

長生郡白子町

Port Curtisborough

Pruittview

селище Одарка

хутір Богодар

село Альберт

横浜市都筑区

In the above example, we created cities from 3 different locales. Each call picks one of the configured locales at random.

Create the Same Fake Data

To get the same fake data on every run, seed the generator before generating data; rerunning the same code then produces the same output.

myGenerator = Faker()
myGenerator.random.seed(1234)

for i in range(10):
    print(myGenerator.country())

Slovakia (Slovak Republic)

Kazakhstan

Brazil

Albania

Bermuda

United States Minor Outlying Islands

Western Sahara

Wallis and Futuna

Sri Lanka

Mozambique

Note: You can use any number as a seed.

Reference

