
693 Stories To Learn About Data

by Learn Repo, January 6th, 2024

Too Long; Didn't Read

Learn everything you need to know about Data via these 693 free HackerNoon stories.


Let's learn about Data via these 693 free stories. They are ordered by the most reading time generated on HackerNoon. Visit the /Learn Repo to find the most read stories about any technology.

Data is the king, queen, oil, sun, and the moon.

1. The Difference Between JDBC, JPA, Hibernate, and Spring Data JPA

Connecting a database to a Java application is not an easy process. You need to consider the connection pool, the data access layer, etc.

2. An Intro to Resiliency, DHT, and Autonomous Economic Agents

According to the paper published by Lokman Rahmani et al., the S/Kademlia distributed hash table (DHT) used by the ACN is resilient against malicious attacks.

3. How the TypeScript Pick Type works

The Pick utility Type lets us take types based off existing ones, by selecting specific elements from them. Let's look at how it works and when to use it.

4. Top 10 Open Datasets for Linear Regression

On Hacker Noon, I will be sharing some of my best-performing machine learning articles. This listicle on datasets built for regression or linear regression tasks has been upvoted many times on Reddit and reshared dozens of times on various social media platforms. I hope Hacker Noon data scientists find it useful as well!

5. A Better Guide to Build Apache Superset From source

In this article, we’ll be deep-diving on how to build Apache Superset from the source. The official documentation is too complicated for a new contributor and thus my attempt to simplify it.

6. 11 Best Climate Change Datasets for Data Science Projects

Data is a central piece of the climate change debate. With the climate change datasets on this list, many data scientists have created visualizations and models to measure and track the change in surface temperatures, sea ice levels, and more. Many of these datasets have been made public to allow people to contribute and add valuable insight into the way the climate is changing and its causes.

7. Introducing CatalyzeX: A Browser Extension for Machine Learning

Andrew Ng likes it, you probably will too!

8. How To Scrape Google With Python

Ever since Google Web Search API deprecation in 2011, I've been searching for an alternative. I need a way to get links from Google search into my Python script. So I made my own, and here is a quick guide on scraping Google searches with requests and Beautiful Soup.

9. Pyth and Auros are Bringing Real-Time High-Frequency Data to Blockchain Protocols

Auros, a company specialising in algorithmic trading and market making, and Pyth Network will provide access to high-frequency data in real-time.

10. The Difference Between Privacy and Security

For many, privacy and security seem to be words that are interchangeable. Yet, you can have one without the other and users need to be aware of what they get.

11. Ruby: How to read/write JSON File

In Ruby, reading a JSON file into a hash and writing a hash back to a JSON file can be achieved using file handling.

12. 6 Reasons to Utilize Sandbox Technology in Game Development

Running a successful online application is an exciting journey, but it is also full of challenges. It starts from product-market fit (PMF)

13. Crypto Singularity and Data Dignity: the Lowdown at Blockstack Summit

This 2019 has been clearly marked by a bearish wave (and also speculative events) and with that comes a breath of much needed space for the builders to have room to build the runway for the solutions proposed in the many white papers distributed all over the web.

14. How to Create a Simple Dashboard with Google Forms and Google Data Studio

Google products are generally free to use, so you don’t need to go overboard if you handle simple data. No cost, just a productive dashboard.

15. What Qualifies You To Be A Cybersecurity Professional?

Data breaches and ransomware attacks are getting more common. If you want to get in on this industry as a cybersecurity professional, you need qualifications.

16. Running a Python Script to Scrape LinkedIn Profiles From Google

LinkedIn is a great place to find leads and engage with prospects. In order to engage with potential leads, you’ll need a list of users to contact. However, getting that list might be difficult because LinkedIn has made it difficult for web scraping tools. That is why I made a script to search Google for potential LinkedIn user and company profiles.

17. Busting AI Myths: "You Need Tons of Data for Machine Learning"

Leading researchers like Karl Friston describe AI as "active inference" —creating computational statistical models that minimize prediction-error. The human brain operates much the same way, also learning from data. A common argument goes:

18. Let Data Shed Some Light in the Midst of COVID-19

The burden the COVID-19 novel coronavirus has placed on the world is enormous. There’s a great thirst for information and clarity. So, we at Logz.io have decided to offer a Community COVID-19 Dashboard Project, so that everyone can better understand how the outbreak impacts the world and their region. We see that as a community effort. We invite the global community of engineers and data scientists to add data to this public dashboard that will cover not just the direct impact of the coronavirus on public health, but other aspects of society as well. We want to help everyone better understand the impact of COVID-19 anywhere around the world.

19. Introduction to a Career in Data Engineering

A valuable asset for anyone looking to break into the Data Engineering field is understanding the different types of data and the Data Pipeline.

20. 5 Mistakes That Make AI Data Labeling Ineffective

Data labeling and annotation is one of the biggest challenges businesses face in developing AI solutions. Here are the top 5 Data labeling mistakes.

21. Reimagining Support and Resistance Indicators with Blockchain Datasets

Support and resistance are two of the best-established concepts in technical analysis trading strategies. Conceptually, both support and resistance identify pricing points on an asset that favor a pause or reversal of a given trend. In traditional technical analysis, there are several indicators that model points of support and resistance, all of them based solely on price trends. Many of those techniques can be extrapolated to crypto-assets, but I think we can do a bit better. For the first time in history, we have an asset class that records parts of the behavior of individual investors and asset holders in public ledgers. That information is a gold mine when it comes to estimating objective levels of support and resistance.

22. How to Transform Your Data Into a Voice AI Knowledge Assistant

RAIN executives give a full breakdown of the build out and power of AI Voice Assistants.

23. Top 6 Data Visualization Tools for 2022

In this blog, you will discover the best data visualization tools to effectively analyze your datasets. Learn about the tools to create intuitive visualizations.

24. 10 Best Stock Market Datasets for Machine Learning

For those looking to build predictive models, this article will introduce 10 stock market and cryptocurrency datasets for machine learning.

25. How to Create Dummy Data in Python

Dummy data is randomly generated data that can be substituted for live data. Whether you are a Developer, Software Engineer, or Data Scientist, sometimes you need dummy data to test what you have built, it can be a web app, mobile app, or machine learning model.
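As a minimal sketch of the idea, the following uses only Python's standard library to generate reproducible dummy user records. The schema (`id`, `name`, `age`, `signup_date`), the name list, and the helper `make_dummy_users` are hypothetical and chosen for illustration; they are not taken from the linked story.

```python
import random
from datetime import date, timedelta

# Illustrative values only; swap in whatever your test needs.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Guido", "Margaret"]


def make_dummy_users(count, seed=0):
    """Return `count` randomly generated user records (hypothetical schema)."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    start = date(2020, 1, 1)
    users = []
    for i in range(count):
        users.append({
            "id": i + 1,
            "name": rng.choice(FIRST_NAMES),
            "age": rng.randint(18, 80),  # inclusive on both ends
            # A random day within the 365 days starting at `start`.
            "signup_date": (start + timedelta(days=rng.randrange(365))).isoformat(),
        })
    return users


if __name__ == "__main__":
    for user in make_dummy_users(3):
        print(user)
```

Seeding the generator is a design choice: it keeps test fixtures stable between runs while still looking random.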

26. Good Ways To Make Your Data More Secure

Data security is a business challenge and a business opportunity, not a mere technical task for your IT department.

27. 10 Data Table Libraries for JavaScript

Tables are a useful tool for visualizing, organizing and processing data in JavaScript. To start using them, you need to download a free library or one for a reasonable price. Here is a list of 10 useful, functional, and reliable JS libraries that will help you work with tables.

28. An Intro to No-Code Web Scraping

Web scraping has broken the barriers of programming and can now be done in a much simpler and easier manner without using a single line of code.

29. How to get data from API in Excel

How to get JSON data from an API into an Excel table, in the simplest tutorial, using a formula. A ready-to-go, open-source VBA formula with an intuitive video tutorial.

30. Increase The Size of Your Datasets Through Data Augmentation

Access to training data is one of the largest blockers for many machine learning projects. Luckily, for various different projects, we can use data augmentation to increase the size of our training data many times over.
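To illustrate how augmentation multiplies a dataset, here is a small sketch in plain Python that treats an image as a list of pixel rows and produces flipped variants. The function names are hypothetical, and flipping is only one possible augmentation; it assumes the label is unchanged by mirroring, which does not hold for every task (for example, reading text or digits).

```python
def flip_horizontal(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror a 2D image (a list of rows) top to bottom."""
    return [list(row) for row in reversed(image)]


def augment(image):
    """Return a copy of the original image plus three flipped variants."""
    mirrored = flip_horizontal(image)
    return [
        [list(row) for row in image],  # untouched copy
        mirrored,                      # left-right flip
        flip_vertical(image),          # top-bottom flip
        flip_vertical(mirrored),       # both flips, i.e. a 180-degree rotation
    ]


if __name__ == "__main__":
    tiny_image = [[1, 2], [3, 4]]
    for variant in augment(tiny_image):
        print(variant)
```

With these three transforms, each source image yields four training samples.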

31. Scraping Information From LinkedIn Into CSV using Python

In this post, we are going to scrape data from Linkedin using Python and a Web Scraping Tool. We are going to extract Company Name, Website, Industry, Company Size, Number of employees, Headquarters Address, and Specialties.

32. Object-Oriented Databases And Their Advantages

Object oriented database is a type of database system that deals with modeling and creation of data as objects. The main advantage of this database is the cons

33. What are the Best Data Analytics Tools?

Data analytics is used for transforming raw data into useful insights.

34. Distributed Storage is the Best Data Storage Tool for The Metaverse

The most suitable data storage tool for Metaverse is undoubtedly distributed storage.

35. A Guide to Importing Smartsheet Data into SQL Server using SSIS

Easily back up Smartsheet data to SQL Server using the SSIS components for Smartsheet.

36. How to Avoid Consumer Lock-in with The Decentralised Web

This is the third blog post in our series exploring aspects of Arweave’s decentralised, permanent web (https://www.arweave.org/). You can catch up with the other parts here (https://medium.com/arweave-updates/building-the-decentralised-web-part-one-the-problem-9766f1987c91) and here (https://medium.com/arweave-updates/building-the-decentralised-web-part-two-the-components-97409d1fe545).

37. 5 Web3 Startups That Deserve Your Attention

I've worked with Blockchain & Web3 startups consistently since 2017. I've seen teams come and go, businesses flourish only to fail, and bull and bear markets prop up, or kill great ideas respectively.

38. Top 10 Data Science Project Ideas for 2020

As an aspiring data scientist, the best way for you to increase your skill level is by practicing. And what better way is there for practicing your technical skills than making projects.

39. Secrets to Growth Marketing Data Engineering – Even in This Down Economy

Marketing is a big business and it's only going to grow bigger. One reason for this is that marketers need to keep growing the list of data points.

40. Ghost in Your Machine

What’s more frightening than Halloween? Data migration.

41. Data Labeling for AI Products: How to Process Thousands of Data Labels

Here are a handful of recent case studies that show the power of data labeling in action.

42. Facebook's Deepfake Challenge That Will defeat Deepfakes. Hopefully.

Nowadays, we are seeing a new wave and great advancements in different technologies. Things like Deep Learning, Computer Vision, and Artificial Intelligence are improving every single day. And Researchers and scientists are having amazing use-cases with these technologies which can change the direction of our world.

43. Building a Serverless Data Pipeline to Analyze Meetup data

Building a Serverless Data Pipeline to Analyze Meetup data

44. Using SPyQL and Python to Run Command Line Analytics

SPyQL combines Python and SQL to make querying of CSV and JSON data easy. In this tutorial we analyse the geographical distribution of cell towers.

45. Learning SQL Can Give You a Major Career Boost

Why learning SQL is a major career boost with LogicLoop

46. 17 Open Crime Datasets for Data Science and Machine Learning Projects

For those looking to analyze crime rates or trends over a specific area or time period, we have compiled a list of the 16 best crime datasets made available for public use.

47. DOCSIS 3.1 Technology: Everything You Need to Know

In this tech guide, we will cover the important details about DOCSIS 3.1 technology.

48. What’s Wrong With GraphQL?

While GraphQL offers several benefits, there are some potential disadvantages and challenges to using it in C# to consider, before you decide to implement it.

49. Data Mapping and What It Means for Business Strategy

Data mapping solutions powered by AI and ML enable users to bridge the differences in the schemas of data source and destination in a target repository.

50. How to Become the Data Whisperer

The data whisperer is the function sitting between the business and the technologists.

51. A Beginner's Introduction to Database Backup Security

With more companies collecting customer data than ever, database backups are key.

52. How Data Teams Can Benefit From Running Like a Product Team

Product teams have a lot of great practices that data teams would benefit from adopting. Namely: user-centricity and proactivity.

53. Why Are Removed Posts Still Visible on Reddit?

Even if moderators delete a post that is breaking the rules of Reddit, it is still very easy to find.

54. How to model an efficient database for your application

What is Database Modeling?

55. Universal Data Tool: Time Series Data and Audio Labeling [Update 9]

If you haven’t heard of the Universal Data Tool yet, it’s an open-source web or desktop program to collaborate, build and edit text, image, video, and audio datasets with labels and annotations.

56. The Pros and Cons of Collecting Online and Offline Data

57. The Fastest Way to Become A Professional Data Analyst

Sharyph, a tech writer, goes over how to become a professional data analyst.

58. Sensor-based Control in Cobots: Its Opportunities and Challenges

Introduction of the very basic formulation of the major sensor-servo problem, and then presenting its most common approaches like touch-based,

59. Why User Testing is Your Competitive Advantage

Give your users time to explore the UI of your latest product offering so you can gain an understanding of how interaction can be improved.

60. 5 Best Website Categorization Tools

Website categorization refers to the process of classifying websites that users come into contact with into various categories.

61. Building A Secure Data Economy: An Interview with Ocean Protocol's Founder Bruce Pon

Ocean Protocol is technology that allows data sharing in a safe, secure and transparent manner without any central intermediary. Using Ocean Protocol, data scientists and artificial intelligence researchers can unlock and analyze big data, while respecting data privacy.

62. The Importance of Hypothesis Testing

Hypothesis tests are significant for evaluating answers to questions concerning samples of data.

63. The Importance Of Data in Sales in 2022

64. Why Data Governance is Vital for Data Management

Both data governance and data management workflows are critical to ensuring the security and control of an organization’s most valuable asset-data.

65. Building an Airtight Security Funnel Step-by-Step

In this article, we’ll walk through SharePass’s patent-pending security funnel, providing a step-by-step guide to building out your security pipeline.

66. Top 3 Benefits of Insurance Data Analytics

The Importance of data analytics and data-driven decisions across the board and in this case insurance data.

67. Decentralized Storage: Confronting the Challenges

Decentralized storage is still far from mature. Three key obstacles - technical, regulatory and adoption - currently stand in its way.

68. How Smart Analytics Can Help Small Businesses Boost Sales

Technology has taken over the world, and now is the time for small businesses to realize that what they need is tech. Smart analytics makes everything easier.

69. How to Build a Decoupled Microservice Using Materialize

One way to handle data in microservice architectures is to use decoupled microservices architecture. This form of architecture can bring many benefits.

70. Decoding MySQL EXPLAIN Query Results for Better Performance

Understanding MySQL EXPLAIN query output is essential for optimizing a query. EXPLAIN is a good tool for analyzing your query.

71. The Failed Promises of Extract, Transform, and Load—and What Comes Next

Faster, Better Insights: Why Networked Data Platforms Matter for Telecommunications Companies

72. How to Use Public Keys in Data Lifecycles

The data lifecycle (also known as the information lifecycle) refers to the full-time period during which data is present in the system.

73. Kafka Authorization And NiFi Encryption to Amazon S3

Any typical ETL/ELT pipeline cannot be completed without having "kafka" keyword in the discussions.

74. Useful Resources for Data Structure & Algorithm Practice

These four resources may be useful for learning about data structures and practicing making algorithms for your advanced programming needs in your work.

75. Benefits of Corporate Data Backup and Best Practices to Keep in Place

Nowadays, companies are increasingly relying on corporate data backup solutions to guarantee the safety and recoverability of their data. Read on to learn more

76. A Guide to Web Scraping With JavaScript and Node.js

With the massive increase in the volume of data on the Internet, this technique is becoming increasingly beneficial in retrieving information from websites and applying them for various use cases. Typically, web data extraction involves making a request to the given web page, accessing its HTML code, and parsing that code to harvest some information. Since JavaScript is excellent at manipulating the DOM (Document Object Model) inside a web browser, creating data extraction scripts in Node.js can be extremely versatile. Hence, this tutorial focuses on javascript web scraping.

77. Using User Data After Google's Third-party Cookies Ban

Google announced that it would ban the usage of third-party cookies; it has made a lot of publishers afraid that they won't be able to utilize user data.

78. Data Lakehouses: The New Data Storage Model

Data lakehouses are quickly replacing old storage options like data lakes and warehouses. Read on for the history and benefits of data lakehouses.

79. How to Migrate Data from an MSSQL Server to PostGreSQL?

Thinking of shifting to a new database management engine? Here's how to migrate data from SQL server to PostgreSQL.

80. How to Efficiently Manage Queues in SQL Databases

A queue using an SQL database? Well, you need to know the pros and cons, and a typical implementation.
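One common shape for such a queue can be sketched with Python's built-in `sqlite3` module. The `jobs` table, its columns, and the helper names below are assumptions made for illustration, not the implementation from the linked story. On a multi-worker server database, a row-locking clause such as PostgreSQL's `SELECT ... FOR UPDATE SKIP LOCKED` is typically used for the claim step instead.

```python
import sqlite3


def create_queue(conn):
    """Create the (hypothetical) jobs table that backs the queue."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS jobs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )


def enqueue(conn, payload):
    """Append a job to the end of the queue."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def dequeue(conn):
    """Claim the oldest pending job, or return None if nothing is pending."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check guards against another worker claiming the row first.
        claimed = conn.execute(
            "UPDATE jobs SET status = 'processing' "
            "WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        return row if claimed.rowcount == 1 else None


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_queue(connection)
    enqueue(connection, "send-email")
    print(dequeue(connection))
```

Keeping claimed rows in the table with a status column, rather than deleting them, leaves an audit trail and makes it possible to retry jobs that were claimed but never finished.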

81. How To Solve the Problem With Key Metrics In a B2B Product

To learn how B2B companies solve the problem with key metrics in a product, I caught up with Yuri Brankovsky who has worked in multiple digital products.

82. SubQuery to Make Blockchain Data Easily Accessible on the Cosmos Blockchain

SubQuery is a blockchain developer toolkit that allows for web3 infrastructure through a custom open-source API between data and decentralized applications.

83. Data Journalism 101: 'Stories are Just Data with a Soul'

Gone are the days when journalists simply had to find and report news.

84. 'At the Coalface of Implementing Data Stacks': kleene's Co-founder & CEO Andrew Thomas

2-minute look at the building of kleene.ai through a founder's eyes.

85. 9 Data Trends You’ll See in 2023

2022 saw the data space grow by leaps and bounds. Here are the top 9 things our team of data experts expects to see in 2023.

86. Make Data-Driven Decisions With Power BI Consulting & Implementation

Power BI offers a solution for businesses that need to manage large volumes of data. It's designed to help with even the heaviest data flows business have.

87. Scraping Glassdoor Job Data

Glassdoor is one of the biggest job markets in the world but can be hard to scrape. In this article, we'll legally extract job data with Python & Beautiful Soup

88. AI Is Making Our Concrete Buildings And Bridges Safer

AI's application to civil engineering and concrete construction is the future of structural safety. There have been various successful & innovative applications.

89. Proven Metrics and Important KPIs for Startups to Measure Success

What are the most important KPIs for startups to measure success? Find your answer in this article and learn what key product metrics to track to enable growth.

90. What is an API, Simply Explained

Connectivity is something amazing. Right now, we are used to using our computers or phones to buy, post, watch, etc. We can do lots of things actually. We are connected to the world and to each other.

91. What is a Citizen Data Scientist and How Do You Become One?

Data science has been democratized for the most part. AI is now mainstream! It's no longer the exclusive province of large companies with deep pockets.

92. What's in Store for Privacy and Personal Data Protection in 2022?

2021 saw many advancements in internet privacy, what does 2022 have in store?

93. The Black Market for Data is on the Rise

Once the laughingstock of the Internet, hackers are now some of the most wanted criminals in the world.

94. A Tale of Two Cities: Economic vs Digital Democracy

More than new laws and fines, we need to reconsider data ownership as a whole and discover new structures that place control back into the hands of the people.

95. An Overview of Cyber Insurance for MSPs

Cyber insurance is a type of insurance policy designed to protect businesses and individuals against losses resulting from cyber attacks and data breaches.

96. 8-Ways Data Mining Can Improve your Business

If your company is trying to make sense of its customer data, here’s a not-so-surprising fact for you: you aren’t alone. Far too many companies want to understand data and gain in-depth insight into the information they are sitting on. Let’s be clear that today, the success of a business lies in how efficient its data mining process is and in its expertise at processing the available data, as this can help it answer the age-old questions that make or break it:

97. Metrics, logs, and lineage: 3 Key Elements of Data Observability

Data observability is built on three core blocks: metrics, logs, and lineage. What are they, and what do they mean for your data quality program?

98. Join to Write Data Into Your First Decentralized Database

The DB3 Network is a start-up project to build a decentralized, permissionless platform for programmable data processing.

99. What is Data Analytics and How It Can Be Used

WHAT IS DATA ANALYTICS?

100. Five Common Reasons Why Data Integration Projects Fail

It’s 3 AM. My alarm goes off and I groggily climb out of bed and crack open my laptop. One of our biggest customers needs their data delivered by 9 AM, and I’m getting up before sunrise to triple-check every data point before their delivery. Our data platform was built with hundreds of data audits, but this customer’s delivery was just too complex to feel 100% confident that we’ve captured all potential issues. This scenario would soon become a typical morning for me. Wake up. Coffee. Pray to the data gods for an inbox without 500 Zendesk ticket escalations.

101. How to Leverage Predictive Analytics in Your eCommerce Businesses

Predictive analytics is able to predict which customers are most likely to churn or which products are most likely to be returned. Here are 6 other use cases.

102. What is Digital Footprint Management?

Your digital footprint refers to the trail of information you generate when creating, sharing, or storing any digital data.

103. Reimagining Marketing Teams: Why Creatives Should Embrace the Tech Side of the Business

With the pace of the business world, it is increasingly important for marketers to embrace new technological breakthroughs to stay ahead of the curve.

104. Diffusion by Push Technology Now Supports MQTT

Support for the OASIS MQTT open standard protocol is the main feature added to Diffusion 6.6 Preview 2, the latest release of the Diffusion® Intelligent Event Data Platform.

105. Best Practices For Backend Data Security

Backend data security relies on encryption, access control, data backup, and other such features. These best practices are intended for the backend.

106. IoT in Smart Buildings – A Behind-The-Scenes Picture on Data Utilization

IoT and smart buildings are all about data, but how is all this data used and what kind of data can you get? Read Haltian's article about smart buildings & data

107. SocialFi — Social Networks on the Blockchain & What to Expect From Web 3

How do social networks of the future differ from the usual ones, and what projects to expect in 2023.

108. The Noonification: Immigrant Teens Are Working Dangerous Night Shifts in Factories (11/21/2022)

11/21/2022: Top 5 stories on the Hackernoon homepage!

109. Mastering NumPy Arrays(Part 1): Stacking and Splitting

A comprehensive guide for NumPy Stacking. How to stack numpy arrays on top of each other or side by side. How to use axis to specify how we want to stack arrays

110. Web3, Data, and the Issue of Self-Sovereignty

Whoever owns your data, owns your decisions.

111. 6 Biggest Differences Between Airbyte And Singer

We’ve been asked if Airbyte was being built on top of Singer. Even though we loved the initial mission they had, that won’t be the case. Airbyte's data protocol will be compatible with Singer’s, so that you can easily integrate and use Singer’s taps, but our protocol will differ in many ways from theirs.

112. Encoding Categorical Data for ML Algorithms

Encoding is a technique used to convert categorical data to numerical representations to be able to use the data in machine learning algorithms.
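Two widely used encodings, label encoding and one-hot encoding, can be sketched in plain Python as follows. The function names are hypothetical, and assigning integer codes in sorted category order is an assumption made here so the output is deterministic.

```python
def label_encode(values):
    """Map each distinct category to an integer code (sorted order)."""
    mapping = {category: code for code, category in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


def one_hot_encode(values):
    """Turn each value into a 0/1 vector with one slot per category."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return rows, categories


if __name__ == "__main__":
    colors = ["red", "green", "blue", "green"]
    print(label_encode(colors))
    print(one_hot_encode(colors))
```

Label encoding implies an ordering between categories, which some models will treat as meaningful; one-hot encoding avoids that at the cost of one column per category.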

113. Self-service Data Preparation Tools Can Optimize Big Data Efficiency for the IT Team

Self-service data preparation tools are designed for business users to process data without relying on IT, but that doesn’t mean IT users can't benefit too.

114. What installing the Messenger app tells us about Facebook

Messenger’s onboarding is a great case study of manipulative design

115. Top 20 Twitter Datasets for Machine Learning Projects

It is often very difficult for AI researchers to gather social media data for machine learning. Luckily, one free and accessible source of SNS data is Twitter.

116. What the 2020 Toilet Paper Shortage Can Teach Us About AI

AI-driven technology expedites this process and helps companies meet consumer demand, cater to consumer concerns, and personalize the consumer experience.

117. Quantum-resistant Encryption: Why You Urgently Need it

The Second World War brought to the front burner the world of espionage, which is the precursor of cybersecurity, as is seen in the modern world. Technological advancements such as the quantum computer necessitate that we take the war against cybercrimes to another level.

118. Jetpack DataStore in Android Explained

The JetPack Datastore is an Android data storage solution that is helpful when making Android-based mobile apps by providing a way for data to be retrieved.

119. Going Beyond "Have You Tried Unplugging It and Plugging It Back In?" Taking IT to the Clouds

In this new era of digital transformation, there isn’t really a good excuse for companies that claim they want to succeed but aren’t willing to invest in employ

120. How to Query Deeply Nested JSON Data in PSQL

Recently I had to write a script, which should’ve changed some JSON data structure in a PSQL database. Here are some tricks I learned along the way.

121. How Data-Driven Coaching Helps Employees Reach Their Potential

Data is everywhere. In the business world alone, we use it to track search engine traffic, monitor website activity, land sales, improve customer service.

122. On the difficulty of creating a data science code of ethics


123. The Growth Marketing Writing Contest: Round 1 Results Announced!

Growth marketers - the wait is OVER. The first round results announcement of the Growth Marketing Writing Contest is now LIVE!

124. Decoding MySQL EXPLAIN Query Results for Better Performance (Part 2)

Understanding MySQL EXPLAIN query output is essential for optimizing a query. EXPLAIN is a good tool for analyzing your query.

125. Facebook and Anti-Abortion Clinics Have Your Info

Facebook is collecting ultrasensitive personal data about abortion seekers and enabling anti-abortion organizations to use that data

126. Why the Gaming Chip Shortage in the Gaming Industry is not Game Over

The global chip shortage has taken the gaming industry by storm, as it is one of the biggest industries most affected, and the resupply of consoles can last unt

127. Data Loss Prevention: What is it, and Do You Need it?

Data Loss Prevention is a set of tools and practices geared towards protecting your data from loss and leak. Even though the name has only the loss part, in actuality, it's as much about the leak protection as it is about the loss protection. Basically, DLP, as a notion, encompasses all the security practices around protecting your company data.

128. Watch Out for Deceitful Data

Nowadays, most assertions need to be backed with data, as such, it is not uncommon to encounter data that has been manipulated in some way to validate a story.

129. What Is Modern Business Intelligence?

This article gives insight into some basic features and functionality that a desirable modern BI software has and illustrated some examples.

130. What is RFM (Recency, Frequency, Monetary) Analysis?

RFM analysis is a data-driven customer segmentation technique that allows marketing professionals to take tactical decisions based on severe data refining
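The three RFM measures can be computed from a transaction log in a few lines. This is a rough sketch: the `(customer_id, purchase_date, amount)` tuple layout and the `rfm` helper are assumptions for illustration, and a full analysis would usually go on to bucket each measure into scores.

```python
from datetime import date


def rfm(transactions, today):
    """Summarize recency, frequency, and monetary value per customer.

    `transactions` is an iterable of (customer_id, purchase_date, amount)
    tuples; this layout is a hypothetical one chosen for the example.
    """
    summary = {}
    for customer, day, amount in transactions:
        entry = summary.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        entry["last"] = max(entry["last"], day)  # most recent purchase so far
        entry["frequency"] += 1                  # number of purchases
        entry["monetary"] += amount              # total amount spent
    return {
        customer: {
            "recency_days": (today - entry["last"]).days,
            "frequency": entry["frequency"],
            "monetary": entry["monetary"],
        }
        for customer, entry in summary.items()
    }


if __name__ == "__main__":
    log = [
        ("alice", date(2023, 1, 1), 20.0),
        ("alice", date(2023, 1, 10), 30.0),
        ("bob", date(2022, 12, 1), 5.0),
    ]
    print(rfm(log, today=date(2023, 1, 15)))
```

Customers with low recency, high frequency, and high monetary value are usually the segment marketers prioritize.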

131. Exporting Data to Fit Your Needs

A lot of the work we do at ChartMogul centers around how we display and present your data in a clear and transparent way.

132. Understanding the 'Data is the New Oil' Analogy

Earlier, we lived in industrial and post-industrial societies, and gas and oil were the only things of value. Now, it’s the age of information society and data has replaced petrol as the economy’s driving force. The reason is that with the help of Big Data, people significantly improve production efficiency and business economics. That’s true.

133. From 1999 to 2020, Google Grew from 10k to 4.6B Daily Searches

The Internet Live Stats graph above pictures Google's first 13 years. Today, according to Internet Live Stats, 4,517,847,993 DA(internet)Us make 4,781,309,755 Google searches a day.

134. AI and Crowdsourcing: Using Human-in-the-Loop Labeling

With human-in-the-loop data labeling, humans and machines complement each other, which results in simple solutions for a variety of difficult problems at scale.

135. Have You Read Your Privacy Notice in Detail?

Do you recall every company you have given consent to use your data as you browse a website or sign-up to a ‘free’ service? It's time we moved beyond consent

136. How to Make Sure Your Nonprofit is Complying With HIPAA

It is important for your non-profit organization to comply with HIPAA to protect health data. Here's how you can do so.

137. Blockchain Technology Improves Data Authentication and Transparency in Healthcare

Blockchain is the secret to trusting the data as it moves into our healthcare ecosystem.

138. Using a Relational Database to Query Unstructured Data

Using Relational Database to search inside unstructured data

139. How Data Scientists Start Automating Their Tasks With Python

Introduction to automation with Python and my top 3 most used code snippets.

140. Trino: The Open-source Data Query Engine That Split from Facebook

If you want to accelerate Trino queries with a response time of seconds to minutes, click here to learn how Trino helps engineers.

141. NLP Datasets from HuggingFace: How to Access and Train Them

The Datasets library from Hugging Face provides a very efficient way to load and process NLP datasets from raw files or in-memory data. These NLP datasets have been shared by different research and practitioner communities across the world.

142. Principles of a Clean Relational Database

The article describes how a relational database should be designed to properly work in OLTP mode.

143. Behavioral Analytics: The Foundation of Targeted Marketing and Predictive Analytics

Learn how to capitalize on your business standards and increase the conversion rate by approximately 85% by analyzing customer behaviors with data you collect.

144. The hidden risk of ethics regulation

Regulating the tech industry won’t fix its ethical problems; it might make them worse. Mike Monteiro has written the most compelling argument I have seen for regulation. Regulation would address many of the kinds of ethical risks that have made headlines recently. But I think it would leave many risks in place and introduce new risks (a more systemic risk, in fact) that in the long term would actually expose the public and the industry to more potential downside than they currently face. Regulation at scale requires rules that stipulate what is ethical and what is not, in the case of the discussion of the ethics.

145. 6 Tips to Get More Value Out of Your Microsoft Power BI Dashboard & Reports

By using Microsoft Power BI, you increase the efficiency of your company through its interactive insights and visual clues. Here are 6 tips for Power BI users.

146. The Noonification: The Idea of “Safe Cex” Should Stay in 2022 (1/20/2023)

1/20/2023: Top 5 stories on the Hackernoon homepage!

147. Defining the Problem in Your Data Science Project Can Lead to Success

Defining the Data Science Problems the right way is hard work. The failure rate of various data science initiatives is really high — often ~70-80%.

148. How 5 Massive Data Breaches Could Have Been Prevented

One of the biggest losses for companies? Inadequate cybersecurity.

149. A High Level Explanation of Data Types for Decision Makers

There are three different types of data: structured data, semi structured data, and unstructured data.

150. Brace Yourself: Data Cleanup is Coming

It goes without saying that data is the cornerstone of any data analysis.

151. Fintech Should Focus On Long-Term Vision, Not Short-Term COVID Buzz

The COVID19 crisis has been playing out globally for over half a year, or almost a year counting its early phase in China. It’s been hurting a lot of sectors, but one particular sector stood to benefit — Fintech.

152. Automated Offline Backups Can Save the World

Ransomware is worse than malware: Systems and data are all locked up, and backups are all encrypted, too.

153. Machine Generated Whiskey

Thanks to Microsoft, and a lot of whiskey data.

154. Debugging My Love Life

Tinder's "Top Spotify Artists" feature is relatively shallow, but could be fixed easily. Here is a demonstration of how it works currently and what can change.

155. Software Development Tricks Coding for Beginners and More

This week on HackerNoon's Stories of the Week, we looked at three articles that covered the world of software development from employment to security.

156. Building an Efficient AI Platform for Data Preprocessing and Model Training

Lei Li, AI Platform Lead, and Zifan Ni, Senior Software Engineer from Bilibili, share how they increased the training efficiency on their AI platform.

157. Denial Of Service (DoS) Attacks: Nature And Method Of Infection

Denial Of Service or DoS attacks work by overloading the target host’s bandwidth, preventing other users from accessing the affected server, denying service.

158. The Three Basic Benefits of a Virtual Data Room

The popularity of online virtual data rooms has increased over the years. These are innovative software tools used for the safe storage and sharing of files. As the world modernizes, people are using advanced technology to carry out their daily tasks. As everything today is digital, it becomes more and more crucial to look for new methods to store files. Gone are the days when people used to pile up hard copies of all the files in their offices. Some people still do that, which wastes half of their time. Imagine you have a business meeting in a short while and you can’t find a specific file because there is a huge unorganized bundle of files in your office. With virtual data rooms, all your files are well organized. You do not have to go through the hassle of finding a certain file. With just one click, the file appears in front of you in no time.

159. Meet Data: The Driving Power of Fintech

Of late, “Fintech” has been and remains a buzzword. It is transcending traditional banking and financial services, encompassing online wallets, crypto, crowdfunding, asset management, and pretty much every other activity that includes a financial transaction. It thereby competes directly and fiercely with traditional financing giants and their methods.

160. Executing a T-test in Python

In today’s data-driven world, data is generated and consumed on a daily basis. All this data holds countless hidden ideas and information that can be exhausting

161. Artificial Intelligence and Big Data

Artificial Intelligence and Big Data. These two terms seem to permeate the tech world in every possible way one can think of. Along with giant terms like Machine Learning, IoT, blockchain and related ones, AI and Big Data are set to dominate our world in the years ahead.

162. Public Health Improvements as a Result of Data Usage and Analysis in Healthcare

Big data has made a slow transition from being a vague bogeyman to being a force of profound and meaningful change. Though it’s far from reaching its full potential, data is already having an enormous impact on healthcare outcomes across the world, both at the public and individual levels.

163. Can Data Automation Transform The Workplace?

Every minute, a staggering 1,820 terabytes of data is created around the world. That’s more than 2.5 quintillion bytes every day! This data takes many forms, from Tweets and Instagram posts to the generation of new bitcoin.

164. If You Have Important Data: Make Sure Its Protected

Data transfer is very important and it keeps happening almost every minute. As we chat on various social media applications or even like a post, there is a transfer of information that is happening. While we may not be too bothered about the way in which information and data are transferred from the receiver to the sender and vice-versa, we, of course, would be concerned about the safety of the data and information that is flowing on the internet and other forms of communication.

165. Why Python Is Leading the Charge in Data Analytics

Python is one of the oldest mainstream programming languages, which is now gaining even more ground with a growing demand for big data analytics. Enterprises continue to recognize the importance of big data, and $189.1 billion generated by big data and business analytics in 2019 proves it right.

166. Building an AI Red Team to Stop Problems Before They Start

An incredible 87% of data science projects never go live.

167. Hacking Emotions: Interpreting Current Events through News Sentiment

How can organizations best measure news sentiment to gain insights about customer and investor behavior?

168. Why co-location is the best way to mine bitcoin

Since the recent Bitcoin halving event, most small and medium crypto miners have had to shut down their mining rigs. Simply put, it is not profitable to have a mining rig in your home at current market prices. However, there are some solutions to the issue.

169. 4 Critical Steps To Build A Large Catalog Of Connectors Remarkably Well

The art of building a large catalog of connectors is thinking in onion layers.

170. Google Analytics Heartbeat Data Visualization

An experiment in real-time data visualization

171. Ethereum Merge: “15 Days Before and After” Data Analysis, Сensorship in Ethereum Blockchain

In this article, I will analyze what actually happened, taking as a basis 15 days before and 15 days after the transition.

172. Trends Uncovered by Scraping OpenSea Data to Analyze NFT Collections

Web Scraping OpenSea to get NFT data and trade history about the NFT collection The Bored Ape Yacht Club

173. Application Programming Interface (API): What it is and How to Use it

APIs are less like USB ports or fire hoses than they are like a person at a help desk in a foreign country. An API will not give you all of a program’s information or code (like a fire hose), because what would stop you from replicating the entire code base? Instead, an API provides you with data its programmers have made available to outside users. Even so, you have to know the language and ask the right questions to do anything with this data.

174. Kimball & Inmon vs. the Retail Store

Years back I read a blog post about database scalability that simplified the definition of scalability by comparing it to activities in a kitchen. I was quite surprised by how successful the comparison was. Come to think of it, technology is and should be inspired by what’s happening around us. This pushed me into linking technology with my everyday life.

175. The Noonification: Internet Archives Silent Killer (2/2/2023)

2/2/2023: Top 5 stories on the Hackernoon homepage!

176. So You Just Became a Data Science Manager... Now What?

With the rise of data science there has been the rise of data science managers. So what do you need to keep in mind if you wish to join these data translators that are acting as a conduit between the business and technical data teams? Going from a practitioner to a manager — your job now is to make sure that data resources are being used optimally so how do you go about doing this effectively?

177. Using a REST API with Python

Requesting fitness data (backlog) from Terra requires HTTP requests, so I’m writing an essential guide here on using a REST API with Python.

178. The Retail Evolution: Customers Demand Enhancements to the Shopping Experience

Andreas Hassellof, Founder and CEO of Ombori, explores how changing customer behaviors impact retail, and drive retail technology innovation.

179. A Technologist Manifesto against Data Imperialism

Tactical Mission

180. [Everyday Tech Solutions] Turning Feedback Data into Actionable Advice

If you're working on something that users actually use, then you're most likely also acquiring data en masse. When it comes to free text feedback, this data might get lost or stay in the hands of some analysts. Here is how to take a few easy steps to turn that data into actionable steps instead.

181. 5 Reasons Why VPNs are not Safe in 2021

All good things must come to an end, which may be true for the VPN in 2021. VPNs have been a useful enterprise tool for companies since they started in the 90s,

182. How Can You Minimize Your Online Footprint

You may be shocked to find out what information is available about you and how it could be used. Here are steps you can take to minimize your online footprint.

183. Web Scraping Using Node.js

While there are a few different libraries for scraping the web with Node.js, in this tutorial, I'll be using the puppeteer library.

184. 3 Types of Tools Needed For Effective Project Management, Remotely

Project management is perhaps the most crucial job in any organization. This is because the success of every project is directly proportional to slowly reaching towards the goals and objectives of the organization that were established well in advance.

185. Can Your Organization's Data Ever Really Be Self-Service?

Self-serve systems are a big priority for data leaders, but what exactly does it mean? And is it more trouble than it's worth?

186. Tableau Vs. Power BI: The Complete Comparison

The world of analytics is continually evolving, introducing new goods and adjustments to the modern market. New companies are entering the market and well-know

187. Fenwick Tree Explained

A Fenwick tree is an interesting data structure that uses properties of binary numbers to handle point updates and range queries in your code in some situations.
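
To make the "binary number properties" concrete, here is a minimal sketch of a Fenwick tree (binary indexed tree) for prefix and range sums. The class and method names are chosen for this example and are not taken from the linked article. The expression `i & -i` isolates the lowest set bit of `i`, which decides how many elements each tree slot covers.

```python
class FenwickTree:
    """Fenwick tree supporting point updates and range-sum queries in O(log n)."""

    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)  # slot 0 is unused; the tree is 1-indexed internally

    def update(self, index, delta):
        """Add `delta` to the element at 0-based `index`."""
        i = index + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i  # move to the next slot that also covers this element

    def prefix_sum(self, count):
        """Return the sum of the first `count` elements."""
        total = 0
        i = count
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit to jump to the preceding block
        return total

    def range_sum(self, left, right):
        """Return the sum of elements in the inclusive 0-based range [left, right]."""
        return self.prefix_sum(right + 1) - self.prefix_sum(left)


values = [5, 3, 7, 9, 6]
tree = FenwickTree(len(values))
for position, value in enumerate(values):
    tree.update(position, value)

print(tree.range_sum(1, 3))  # 19  (3 + 7 + 9)
tree.update(2, 3)            # element at index 2 becomes 10
print(tree.range_sum(1, 3))  # 22
```

A plain prefix-sum array answers range queries faster but needs O(n) work per update; the Fenwick tree trades a little query speed for cheap updates.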

188. Using Machine Learning to Build a Ride Acceptance Model for Uber

Objective: Predict if a driver will accept a ride request or not and find the probability of acceptance.

189. Humans, Data and Emergent Factors

Emergent factors refers to the factors that can arise from the interaction of a group of agents or a system.

190. Building a Propensity Model to Target Users Better in Marketing Campaigns

Propensity model to figure out the likelihood of a person buying a product on their return visit. We need to identify the probability to convert for each user.

191. What Happens When You Get Sick Right Now?

We are living in a weird time. Day by day we see more & more people coughing and getting sick, our neighbors, coworkers on Zoom calls, politicians, etc… But here’s when it becomes really, really scary — when you become one of “those” and have no clue what to do. Your reptile brain activates, you enter a state of panic, and engage complete freakout mode. That’s what happened to me this Monday, and I’m not sure I’m past this stage.

192. Is Cloud Computing Really More Sustainable?

We've all heard the environmental benefits of cloud computing, but there are some cons as well. Is the cloud really more sustainable?

193. Facebook Peeked at Your Info When You Applied for Student Aid Online

For millions of prospective college students, applying online for federal financial aid has also meant sharing personal data with Facebook, unbeknownst to them.

194. A How-to Guide for Data Backup and VM Modernization

Data is everywhere; it is something that we all rely on. It is used by individuals and large organizations that collect and store hundreds of files a day.

195. Merging Datasets from Different Timescales

One of the trickiest situations in machine learning is when you have to deal with datasets coming from different time scales.

196. From 2000 to 2018, Total Websites Grew From 17 Million to 1.6 Billion

Screenshot from Internet Live Stats. Interestingly, the internet experienced a year-over-year decline in 2010, 2013, 2015, and 2018. Wonder why? Here are more sites that serve as sources for measuring aggregate website growth:

197. 8 Cloud Computing Trends to Watch in 2021

Cloud computing has grown exponentially in the past decade and is not about to stop. As predicted by Forrester’s research, the global public cloud infrastructure will grow 35% in 2021, largely because of the pandemic. Due to the lingering effects of COVID-19 in 2021, the cloud will be the key focus for organizations looking for increased scalability, business continuity, and cost-efficiency.

198. How Blockchain Can Democratize Data Collection And Why You Should Care

Blockchain technology democratizes data collection and gives individuals data sovereignty in the context of social media being able to collect so much of it.

199. Data Science Teams are Doing it Wrong: Putting Technology Ahead of People

Data Science and ML have become a competitive differentiator for organizations across industries. But a large number of ML models fail to go into production. Why?

200. Data Science With R Programming — Coding Interview Questions

R is a tool used for data management, storage, and analysis in the field of data science. It has applications in statistical analysis and modeling.

201. Understanding the Differences between Data Science and Data Engineering

A brief description of the difference between Data Science and Data Engineering.

202. WTF is Application Monitoring? Do you Need it?

Now that you've built the swanky software application of your dreams, what’s next?

203. How to Track Form Completions with Google Tag Manager

Setting up a website is relatively easy in 2020. Gone are the days when you had to code the whole thing on notepad and then connect to your host with some additional FTP software.

204. 7 Data Quality Metrics To Prioritize

Having high-quality data can make or break your projects in machine learning or business management. These 7 data quality metrics have the largest impact.

205. How to Use Appsmith, Airtable, and Notion to Build a Video Sorting Tool

According to Forbes, 82% of content generated this year is likely to be video.

206. What Will be the 3 Biggest Software Development Trends of 2022?

The number of software developers globally is due to almost double by 2030, yet InterSystems research has found that more than 8 out of 10 developers currently feel they work in a pressured environment. Creating a better experience for developers is key for inciting innovation, but the current data environment continues to evolve in ways that challenge the experience at every turn.

207. Lets Study the Seattle Airbnb Data

So, recently I started my Udacity Nanodegree on Data Scientist. To be honest, the first project speaks about CRISP-DM, which is the CRoss-Industry Standard Process for Data Mining. Let's set that aside and start working on what we learn from the dataset.

208. So You Got Data... What Now?

Data, data everywhere…but not enough to decide!

209. How to Deal with Tech Trust Deficit

We’re more dependent on tech and e-commerce than ever before, and customers want to know that brands are protecting their data and privacy.

210. A Quick Guide To Business Data Analytics

For many businesses the lack of data isn’t an issue. Actually, it’s the contrary, there’s usually too much data accessible to make an obvious decision. With that much data to sort, you need additional information from your data.

211. Differences and Applications of Web Scraping and Data Mining

Learn the differences between web scraping and data mining and how to apply them.

212. 10 Common Coding Mistakes Data Scientists Should Watch Out For

A look at common mistakes that data scientists make in the process of service delivery.

213. Self-Sovereign Identity Based Access Controls or SSIBACs: An Overview

A recent academic paper uses Hyperledger infrastructure to conduct access control processes using decentralized identifiers, verifiable credentials, and conventional access control models.

214. Knowing Where and When to Enforce the Uniqueness of Your Data

This article looks at data uniqueness and discusses where it should be enforced. At application level or database level?

215. A Dash of Data, a Spoonful of Intuition

It's important to make informed decisions that positively impact your organization. But how do you know when to rely on data and when to use intuition?

216. 14 IoT Adoption Challenges That Enterprises Need To Overcome

Tech-enabled industries are never short of buzzwords, and the latest to join the bandwagon is the Internet of Things. Though this Industry 4.0 solution has been available for a long time, the need for adding smart technology has recently increased among industry leaders.

217. How to Back Up Exchange Online Data

218. A Javascript Queue Structure for Buffered Data

If you work with buffered data such as audio/video frame data, you have no doubt appreciated the features of Typed Arrays that came with ES2015 JavaScript. The ability to move, duplicate, and manipulate blocks of data using object methods is achieved by 'imposing' a dataview on the data blocks. These have made buffered data processing a breeze and fast (avoiding slow for-loops and extra code). A detailed discussion of typed arrays is found here: javascript typed arrays.

219. Opinion: There’s Nothing Wrong With Being Tracked by Google

Why you should be happy about companies collecting your data.

220. Good Data is in the Blood of Trusted Applications

Unfortunately, your app is really only as good as the data that supports it and most of the time, that can be out of your control.

221. On Dynamic Observability and Team Culture with Liran Haimovitch, Rookout CTO

Rookout Co-Founder and CTO, Liran Haimovitch, shares the origin story of their debugging tool, what excites him about the startup life, PLG, and more.

222. How Nutanix VM Works

In the era of enterprise cloud, the modern enterprise datacenter must support virtualization with high availability and live VM migration. Traditional storage area networks (SAN) and network attached storage (NAS) don’t suit this. Instead, they are ideal for managing a logical unit number (LUN). A LUN can be a single disk, an entire redundant array of independent disks (RAID), or disk partitions.

223. How Will Blockchain Fix the Centralization of Data?

“In order to have a standard of value [cryptocurrency] must stand outside all value schemes. It must have value in and of itself."

224. Data Is Now a Luxury Good: Here’s Why (It Shouldn’t Be)

When was the last time you read a privacy policy?

225. The Best 50 Sites to Learn About Data Science

Blogs, they’re everywhere. Blogs about travel, blogs about pets, blogs about blogs. And data science is no exception. Data science blogs are a dime a dozen and with so many, where do you start when you need to find the most valuable information for your needs?

226. Top 13 Data Visualization Tools for 2023 and Beyond

With the enormity of data, data visualization has become the most sought-after method to depict huge numbers in simpler versions of maps or graphs.

227. VMware vCenter Converter Alternatives for V2V?

VMware is commonly used to set up data centers for organisations. If the most commonly used solution doesn't work, some alternatives may work better for you.

228. What to Expect from AI in 2022

AI is too complex and dynamic a technology to be approached one-sidedly, only from the business or IT side. Read the article and find out more

229. Use Up-Sampling and Weights to Address Imbalance Data Problem

Have you worked on a machine learning classification problem in the real world? If so, you probably have some experience with the imbalanced data problem. Imbalanced data means the classes we want to predict are disproportionate. Classes that make up a large proportion of the data are called majority classes. Those that make up a smaller portion are minority classes. For example, suppose we want to use machine learning models to capture credit card fraud, and fraudulent activity occurs in approximately 0.1% of millions of transactions. The majority of regular transactions will impede the machine learning algorithm from identifying patterns in the fraudulent activity.
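
The two remedies named in the title can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the linked article's code: `upsample` and `class_weights` are hypothetical helper names, the rows are made-up `(label, id)` tuples, and the weight formula is one common inverse-frequency heuristic (total samples divided by number of classes times class count).

```python
import random
from collections import Counter


def upsample(rows, label_of, seed=0):
    """Resample minority classes with replacement until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw the missing rows at random from the same class (k is 0 for the majority class).
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def class_weights(labels):
    """Inverse-frequency weights: rare classes get proportionally larger weights."""
    counts = Counter(labels)
    total, num_classes = len(labels), len(counts)
    return {label: total / (num_classes * count) for label, count in counts.items()}


# Made-up data: 998 regular transactions and 2 fraudulent ones.
rows = [("ok", i) for i in range(998)] + [("fraud", i) for i in range(2)]

balanced = upsample(rows, label_of=lambda row: row[0])
print(Counter(label for label, _ in balanced))  # Counter({'ok': 998, 'fraud': 998})

weights = class_weights([label for label, _ in rows])
print(weights["fraud"])  # 250.0
```

Up-sampling is normally applied to the training split only, so that duplicated minority rows do not leak into the evaluation data. Weights avoid duplicating rows altogether but depend on the chosen model accepting per-class or per-sample weights.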

230. Creating a digital-first credit model designed for underbanked micro-businesses with Sean Salas

Camino Financial is an AI-powered Community Development Financial Institution (neo-CDFI) offering affordable credit to underbanked Latinx entrepreneurs.

231. Why Use Pandas? An Introductory Guide for Beginners

Pandas is a powerful and popular library for working with data in Python. It provides tools for handling and manipulating large and complex datasets.

232. How Pastel’s Cascade Stores NFT Data Securely

Cascade is a protocol that enables the storage of NFT data and metadata permanently within a highly redundant, distributed fashion with a single upfront fee.

233. How to Improve Data Quality in 2022

Poor quality data could bring everything you built down. Ensuring data quality is a challenging but necessary task. 100% may be too ambitious, but here's what y

234. Why Should Your Business Adopt Cloud-Based IT Solutions?

Data storage and access have long been a concern for businesses. A ton of data is created daily, and efficient storage is a must to keep track of everything.

235. 10 Best Hugging Face Datasets for Building NLP Models

Hugging Face offers solutions and tools for developers and researchers. This article looks at the Best Hugging Face Datasets for Building NLP Models.

236. Is There a 'GitHub For Data Scientists'?

What if I say that there is a place where you can not only store your Data Science projects but also experiment on them right then and there?

237. The Most Commonly Used SQL Queries by Data Scientists

SQL (Structured Query Language) is a programming tool or language that is widely used by data scientists and other professionals

238. These Companies Are Collecting Data From Your Car

Most drivers have no idea what data is being transmitted from their vehicles, let alone who exactly is collecting, analyzing, and sharing that data...

239. An Intro to AI Powered Product Development

The global product development services industry was close to $8 billion in 2020.

240. Using Automation in the Probate Process

Data-heavy manual processes like probate cases are common throughout the legal industry. Thankfully, automation offers a solution to this common problem.

241. MongoDB vs. DynamoDB: Choosing the Best Database for Your Business

All about MongoDB vs DynamoDB. Explore benefits, and in-depth comparison to find out the best choice for your business app.

242. Intro to Structured Query Language (SQL)

In the post, I used a simple SQL query to explain how certain things work in SQL. I also outlined problems with the query and potential ways to improve the code

243. Why Home Media Servers Are Worth Your Time

Files are getting larger and space for your favorite content can be at a premium. Getting your own server can make storing data so much easier.

244. Why You Should Scrape Data from Social Media Websites for Brand Audit

Social media scraping involves automating the process of extracting data from social media websites such as Twitter and Instagram through web scraping.

245. NAS Data Backup Is Essential For Remote Offices

Network Attached Storage (NAS) is a smart, dedicated data storage system that connects to storage drives, allowing multiple users to collaborate and share data.

246. 3 Ways You Can Build and Update Websites Using Data Pushes

Data is getting more and more accessible and is increasingly being used to inform the way businesses operate.

247. 4 Tips To Become A Successful Entry-Level Data Analyst

Companies across every industry rely on big data to make strategic decisions about their business, which is why data analyst roles are constantly in demand.

248. How to Design a Comprehensive Framework for Entity Resolution

In this blog, we will be looking specifically at the issue of resolving entities (also known as record linkage), as well as discussing a comprehensive framework

249. [Infographic] The State of Conversational AI in 2020

Conversational AI was always poised to take off in 2020. In fact, Gartner predicted that 80% of businesses would implement some sort of conversational interface by the end of this year. With the emergence of COVID-19 came compounded growth for the category - and I wanted to capture just how far we’ve come. So for the conversationally curious out there, I created this infographic that offers a clear depiction of where conversational AI stands at this very moment in time.

250. Election Subversion and Manipulation Is Not New

Technology has had a significant impact on how election campaigns are run, and it has been used in a variety of ways to influence election outcomes.

251. Using Data Attribution Comparison Table in Google Analytics 4

The configuration of Google Analytics 4 is not a walk in the park for those unfamiliar with analytical tools or data setup.

252. How We Use dbt (Client) In Our Data Team

This is not really an article, but rather some notes about how we use dbt in our team.

253. Five Undervalued Data Points for Emerging Businesses

Apparently, data has become more ubiquitous than the stars in the sky. In fact, the amount of data produced daily via the Internet is set to top 44 zettabytes. As you might assume, that’s more data than you could possibly fathom or use.

254. Choosing A Colocation Data Centre That’s Right For You

Data and computer systems are at the heart of most companies, which is why it is paramount that where you store your IT infrastructure meets your needs.

255. What is a 'Data Fabric'?

A Data Fabric is a mix of architecture and technology that aims to ease the difficulty and complexity of managing several different data types.

256. Hacking Your Marketing Campaigns With Data Science

There is a ton of data points generated from each of your business activities today. A simple email blast to a few thousand recipients generates data pertaining to the open rates, click-through rates and conversion. These data points can further be distilled to infer specific information about the audience demographics that find your message appealing, the subject lines that trigger the user to open your emails, the CTAs that work, and so on.

257. How to Build a Web Scraper With Python [Step-by-Step Guide]

On my self-taught programming journey, my interests lie within machine learning (ML) and artificial intelligence (AI), and the language I’ve chosen to master is Python.

258. 4 Best Data Recovery Tools For SD cards, USB Drives, and Hard Drives

Oh no! I lost all my vacation pictures. What do I do now? Is it possible to recover all the deleted files from the SD card? Will I ever get to see my photos from the vacation again?

259. Protecting Yourself From CEO Fraud

Yesterday a friend of mine called to tell me that his CEO had emailed him asking for his personal financial details, with a CTA on a shortened URL. He was about to click the

260. What Can Recurrent Neural Networks in NLP Do?

Recurrent Neural Networks (RNN) have played a major role in sequence modeling in Natural Language Processing (NLP). Let’s look at the pros and cons of RNN

261. Data Virtualization: How It Works And What Benefits We Can Get From It

In the healthcare sector, data virtualization (DV) is gaining traction. It's still a hot subject, with many leading industry experts hailing it as a game-changer.

262. 🐱 How To Create Your Own Virtual World With CoderDojo☄️

This summer, we will be hosting a virtual CoderDojo meetup via Zoom. 🍿 Noblesville High Schooler Anna 👩🏻‍🦰 selected this lesson for her summer CoderDojo project.

263. The Operational Analytics Loop: From Raw Data to Models to Apps, and Back Again

Over the next decade or so, we’ll see an incredible transformation in how companies collect, process, transform and use data. Though it’s tired to trot out Marc Andreessen’s “software will eat the world” quote, I have always believed in the corollary: “Software practices will eat the business.” This is starting with data practices.

264. Interpretation of Visualizations of Soil Data and Weather APIs

Learn how to visualize and interpret weather APIs and soil data in different graphs using Python libraries and Google Colab.

265. A New Netflix Style Reality Show for People Who Love Data

Seven data professionals gear up to analyze and visualize one of the largest and most robust datasets out there to win the title - The Iron Analyst!

266. How to Become a Private Home Trader and 10 Tips to Help You Get There

Trading is a booming sector that today attracts many people. Here are 10 tips to help you succeed as a trader.

267. Data Organization – The Great Differentiator in the Digital Era

In business, efficient processes can make or break an organization. If processes are not executed properly, companies lose time, money, and damage their reputation.

268. A Data-Centric Perspective of ChainLink’s Madness

The crypto market might seem incredibly boring to traders these days with Bitcoin and Ethereum behaving like stablecoins 😉. However, a handful of crypto-assets are showing atypical momentum uncorrelated with the rest of the space. Among those crypto-assets, none is capturing the imagination of crypto-speculators like ChainLink. In the last few days, ChainLink has been regularly hitting all-time highs, defying the lack of momentum of the top crypto-assets.

269. Five Data Quality Tools You Should Know

Enterprises ensure their data is accurate, consistent, complete, and reliable by relying on data quality tools.

270. An Internal Email to Tim Cook and the State of Business Intelligence

We get a glimpse into the inner workings of a valuable company and it turns out it's not all sunshine and rainbows.

271. Busting Data Science Myths: "You Need a PhD, Extensive Python Skills, and Tons of Experience"

DJ Patil and Jeff Hammerbacher coined the title Data Scientist while working at LinkedIn and Facebook, respectively, to mean someone who “uses data to interact with the world, study it and try to come up with new things.”

272. 3 Reasons To Connect Data Silos To A CDP

In the current digital age, population data is of exceptional value. So when a company's customer data sits in unconnected silos, it can be a barrier to success.

273. Are MySQL replications as smooth as you think they are?

What are you actually missing out on in MySQL replication? It appears easy, but debugging the problems it causes takes a lot of time. So, here's your answer.

274. Online Privacy is Not an Option: It's a Necessity

How the challenge of protecting personal information online led to data protection and privacy laws in the EU and U.S.

275. Data Engineering Tools for Geospatial Data

Location-based information makes the field of geospatial analytics so popular today. Collecting useful data requires some unique tools covered in this blog.

276. What Are The Challenges of Monetizing and Selling Data?

There have been great advancements in monetization opportunities in the last decade, but there are still challenges when it comes to generating big data analyti

277. Poor Data Quality is the Bane of Machine Learning Models

An examination of the importance of data quality, how it can present itself in a dataset, and how it can impact machine learning models.

278. How to Define Data Analytics Capabilities

Disclaimer: Many points made in this post have been derived from discussions with various parties, but do not represent any individuals or organisations.

279. Hospital Websites are Giving Facebook Sensitive Information

A tracking tool installed on many hospitals’ websites has been collecting patients’ sensitive health information—including details about their medical condition

280. Explore Different Ways You Can Use Data Visualization to Help Your Nonprofit

Data visualization can be a crucial tool for your nonprofit. Figure out when and how to use it to improve your organization.

281. Four Novel Machine Learning Methods for Analyzing Blockchain Datasets

Using machine learning to analyze blockchain datasets is a fascinating challenge. Beyond the incredible potential of uncovering unknown insights that help us understand the behavior of crypto-assets, blockchain datasets present unique challenges to a machine learning practitioner. Many of these challenges translate into major roadblocks for most traditional machine learning techniques. However, the rapid evolution of machine intelligence technologies has enabled the creation of novel machine learning methods that are highly applicable to the analysis of blockchain datasets. At IntoTheBlock, we regularly experiment with these new methods to improve the efficiency of our market intelligence signals. Today, I would like to provide a brief overview of some novel ideas in the machine learning space that can yield interesting results in the analysis of blockchain data.

282. Reasons Why Data Privacy Matters

Data privacy is one of the hottest topics in tech conversation. But what's the deal with it? Is it good? Is it bad? Keep reading to find out.

283. Open-Source Intelligence (OSINT) Use by Governments

In the 1980s, the US military first coined the term ‘OSINT’. Since then, the dynamic reform of intelligence has been beneficial in many different scenarios.

284. Leveraging Data Analytics to Improve Patient Adherence

Role of pharma analytics to enumerate the factors accountable for falling medication adherence and the increasing role of data analytics and machine learnin

285. 10 Reasons to Get Your Cybersecurity Certification

Cybersecurity certifications can give you the skills that employers most expect, and they will prepare you for the diversity needed in the sophisticated areas of cybercrime. So, here are the top compelling reasons for you to pursue additional cybersecurity credentials.

286. What Should I Do After the Data Observability Tool Alerts Me

We need to start building the best practices across the ecosystem to maximize the value of data observability.

287. Hospitals Remove Facebook Tracker but Questions Still Remain

Meanwhile, developments in another legal case suggest Meta may have a hard time providing the Senate committee with a complete account of the health data.

288. Handling Data Integrity Issues Like a Pro

What do you do if an API you reference sends 200 - OK but an error message? What do you do when a critical column is missing from your Excel upload? Read me.
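
As a rough illustration of the two failure modes named above, here is a minimal Python sketch. It assumes, purely for illustration, that the API reports failures in an `error` key of a JSON body; the names `parse_response`, `missing_columns`, and `ApiPayloadError` are hypothetical and not taken from the article.

```python
import json


class ApiPayloadError(Exception):
    """Raised when a response cannot be trusted, even with a 200 status."""


def parse_response(status_code: int, body: str) -> dict:
    # A non-200 status is the easy case: fail immediately.
    if status_code != 200:
        raise ApiPayloadError(f"unexpected status {status_code}")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ApiPayloadError("body is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise ApiPayloadError("expected a JSON object")
    # Assumed convention: failures are reported in an "error" key.
    if payload.get("error"):
        raise ApiPayloadError(str(payload["error"]))
    return payload


def missing_columns(header: list[str], required: list[str]) -> list[str]:
    # Report every required column absent from an uploaded sheet's header.
    return sorted(set(required) - set(header))
```

Checking the body as well as the status code means a "200 - OK" that carries an error message is rejected at the boundary, and listing all missing columns at once lets an upload be refused with a single clear message.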

289. Debezium Introduction: Another Change Data Capture Tool

Building an enterprise data warehouse can be either relatively straightforward or very sophisticated. It depends on many factors, such as the conceptual data model complexity and the variety of source systems. In many cases, applying the Change Data Capture (CDC) approach can make the data integration simpler. Fortunately, there are plenty of CDC tools available on the market, many of which are easy to use and affordable, while others are cumbersome and expensive (for what they are).

290. How to Use Tableau Visualization to Make a Covid Risk Model

In this paper, I used data from two different data sources and merged them together in the Tableau layer to perform the data analysis.

291. How a Data Scientist Sees a Deck of Cards

The Data Scientist Creativity Paradox

292. 'Experience is a Double-edged Sword': Kyle Kirwan, CEO of Bigeye

An interview with the founder and CEO of Bigeye, a data observability platform.

293. Top 7 JavaScript Pivot Widgets in 2022

Pivot Charts are useful tools that can be relied on to visualise huge amounts of data. These 7 JavaScript Pivot Widgets are some of the best ways to use them.

294. 3 New Startups That Are Innovating DeFi Data Analysis Technology

Data analysis as a whole is one of the most important industries. Now that DeFi is a full-fledged industry, there is a growing need for valuable data analytics.

295. GraphQL, GraphQuill, and You

Let’s start with the idea of a database, and a basic query. Server taps the database. Server brings back persistent state information that allows an application to update, and maybe a GUI. What’s wrong with this picture? Not much at first, of course. A GET request is the anchor of RESTful architecture, and in some ways the anchor of the web. So basic that fetch syntax defaults to that type.

296. How to Create a Data Analytics Strategy to Grow Your Business

Are you building a Software-as-a-Service platform? Wondering what data is essential for your business? Time for a Data Analytics Strategy.

297. Practical Tips to Improve Customer Experience with Data

According to a report, almost 70% of companies compete on customer experience.

298. Save API Costs With Data-Centric Security

APIs are quickly becoming the front door to modern enterprises. But the API paradigm also comes with various hidden costs around development, management, etc.

299. Why Linux-Based Brands Are So Desirable

I'll start off by dating myself... it was the year 2000. I was in college and the brand new Mini Disk MP3 player had just come out. Superior audio to CDs and the ability to hold hundreds of songs on 1 little disk. Being a broke college kid, it took me about 6 months to make the purchase. Just when I got used to looking cool with my MD player, a wild flash of cool came across the analog airwaves via a commercial from a company that was only recently regaining its cool with a crappy multicolor desktop PC called the iMac. Of course, I'm talking about Apple. The product was the iPod. I was defeated and nearly threw away my MD player on the spot.

300. Moving From the Flat Earth: Why We Should Switch to Data-Driven Finance

Businesses should switch from linear formulae to data-driven finance. This will allow companies to not only get an immediate revenue boost!

301. How Do You Hack Data Structures and Algorithms? Teach Us Sensei!

Software Engineers are always on the lookout for better, more efficient ways to solve problems.

302. Things to Consider When Looking For Data Science Roles

There is great demand for data scientists, which creates market dynamics that are favourable for the community. More so than your peers in other professions, you will be able to evaluate a company for what it is able to offer you, rather than solely being the one that is being evaluated. So what should you look for when comparing and evaluating data science roles? Here is a list of some commonly known factors plus some less discussed ones that will help you in your evaluation.

303. How To Use Change Data Capture for Fraud Detection

Still relying on overnight processes to drive your decision making? Maybe it’s time to consider an evaluation of your CDC pattern that uses new technology.

304. Common RAID Failure Scenarios And How to Deal with Them

Most businesses these days use RAID systems to gain improved performance and security. Redundant Array of Independent Disks (RAID) systems are a configuration of multiple disk drives that can improve storage and computing capabilities. This system comprises multiple hard disks that are combined into a single logical unit to provide more functions. Acting as one single system, RAID architecture (RAID level 0, 1, 5, 6, etc.) distributes data over all disks.

305. The Burgeoning Global Surveillance State - What's Going On?

What is a surveillance state? Privacy International defines it as one which “collects information on everyone without regard to innocence or guilt” and “deputizes the private sector by compelling access to their data”.

306. How Is Data Automation Transforming The Workplace?

Every minute, a staggering 1,820 terabytes of data is created around the world. That’s more than 2.5 quintillion bytes every day!
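
The two figures above can be checked against each other with a quick calculation. This is a back-of-the-envelope sketch that assumes decimal units (1 terabyte = 10^12 bytes, 1 quintillion = 10^18); the variable names are illustrative.

```python
TERABYTE = 10**12        # bytes, decimal (SI) definition assumed
QUINTILLION = 10**18

terabytes_per_minute = 1_820
minutes_per_day = 60 * 24

# Scale the per-minute figure up to a full day.
bytes_per_day = terabytes_per_minute * TERABYTE * minutes_per_day
quintillion_bytes_per_day = bytes_per_day / QUINTILLION

print(f"{quintillion_bytes_per_day:.2f} quintillion bytes per day")
```

Under these assumptions the daily total comes to roughly 2.62 quintillion bytes, which is consistent with the "more than 2.5 quintillion" claim.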

307. Create A Data Visualization Map Using Mapbox

In this article, we make a map with software called Mapbox in a few simple steps. This won't involve any coding at all!

308. ANSI X12 EDI Basics: A Guide to the ANSI X12 Standards

ANSI X12 EDI is one of the most important concepts that you must be aware of prior to implementing EDI in your organization.

309. Leveraging Data Science in eCommerce: 7 Projects to Try

As an online retailer, how can you improve your business? By providing a better customer experience, of course. An e-commerce company needs to have a good understanding of the following factors:

310. 4 Data Transformations Made Spreadsheet-Easy

Gigasheet combines the ease of a spreadsheet, the power of a database, and the scale of the cloud.

311. 6 Keys to SaaS Security Posture Management

You're not doing everything you can to protect your SaaS environment if you're skipping one of these: 1. Security policy enforcement, 2. Regular configuration..

312. Live on the Edge of Computing

Data is the lifeblood of any application and any business venture.

313. 5 Ways to Store Market Data: CSV, SQLite, Postgres, Mongo, Arctic

What's the most efficient way to store market data? SQL or NoSQL? Let's compare the 5 most common options and find out which is best.

314. What is a Minidump?

Adding minidump support came with a number of technical challenges that we had to address.

315. The Importance Of On-chain Analysis

A look at the importance of on-chain analysis.

316. 7 Q&As About Memory Leaks

317. The Future of the Internet Through the Web 3.0 Lens

Jules Verne, John Brunner, Arthur Clarke, William Gibson, George Orwell: it’s a short list of writers who predicted the future in their books. They wrote about social and technical changes that would take place in human society. Here we are, facing those changes, good or bad.

318. Sberbank-Owned RuTarget Harvested User Data for Months via Google

Google may have provided Sberbank-owned RuTarget with unique mobile phone IDs, IP addresses, location information and details about users’ interests and online.

319. How to Clean and Verify Address Data 'Without Using Code'

Today, data verification has become one of the greatest assets of an organization.

320. 4 Ways Cities Are Utilizing Data for Public Safety

Cities have been using data for public safety for years. What new technology is emerging in public safety, and how does it affect you?

321. Improving Customer Experience Through Personalization With Predictive Analytics

The development of smartphone and computer technologies, and the internet in general, has influenced customers’ default behavior and expectations.

322. Football Data Analysis Using Machine Learning Models Can Potentially Boost Throw-Ins!

Can machine learning models help improve ball accuracy, precision and retention, leading to scoring after throw-ins?

323. How to Make Rough Estimates of SQL Queries

To estimate SQL queries, we need to understand how a database works with queries. Let's find out what exactly the DB does with them.

324. Why FHIR Capabilities of Healthcare Data Platform is Critical to Quality and Cost of Care Delivery

The flexibility of interoperability in the healthcare system has enhanced patient-doctor interaction to a great extent.

325. Not So Fast: Valuable Lessons from the FastCompany Hack

When FastCompany's website was hacked recently, it sent shockwaves through the media world, underscoring the importance of routine cybersecurity hygiene.

326. The Power State of Dark Data.

Have you ever heard of “Dark Data”?

327. Using Data Science To Deal With RTOs

Considering how much fraudulent RTOs can cost a business, using data science to mitigate their frequency can help save an e-commerce business money over time.

328. How to Combat Reader Fatigue to your Content Marketing Campaigns

Does your content generate more yawns than leads? Is your content just another copy of 100 similar articles clogging up the search engines? Hopefully not, but even if you think your content has an impact, there is always room for improvement.

329. Data Analysts As Arbiters of Truth

Coming out of college with a background in mathematics, I fell upward into the rapidly growing field of data analytics. It wasn't until years later that I realized the incredible power that comes with the position.

330. How Government Agencies Flex Their Data Science Muscle

From NASA to the NSA, data science is being employed by the governments of every major country to inform policy, provide public services and, in some cases, surveil ordinary people. In the United States in particular, it underpins many of the public sector’s most important functions, whether we citizens are aware of it or not.

331. CivicGraph: An Open Source Versioning Data Store for Time Variant Graph Data

I would like to introduce an open source, Apache 2.0 licensed project of mine: https://github.com/CivicGraph/CivicGraph

332. Microsoft's Power BI for Business: Features, User Experience And Pricing

All you need to know about Power BI features, benefits, use cases and pricing. A comprehensive guide to Power BI and how it stacks up against Microsoft Excel.

333. 3 Types of Anomalies in Anomaly Detection

An Introduction to Anomaly Detection and Its Importance in Machine Learning

334. These Shifts Will Shape The Future Of Data Centers

According to Gartner, the spending on data centre infrastructure is supposed to grow 6% in 2021 after a steep decline of 10.3% in 2020. The reduced demand in data infrastructure is expected to come back in 2021 once the workforce gets back to the site, according to Naveen Mishra, a senior research director at Gartner.

335. How to Connect to Salesforce Data in AWS Glue Jobs Using JDBC

Connect to Salesforce from AWS Glue jobs using the CData JDBC Driver hosted in Amazon S3.

336. 10 Best Tactics For Your WooCommerce Store Security

WooCommerce is a great plugin for WordPress to build an online store. With an entire eCommerce ecosystem and a dedicated global community, it has achieved the reputation of an industry standard. Still, this doesn’t mean that nothing can go wrong, especially if you ignore essential security precautions. Here are ten tips on how to make your business (and your customers’ data) safe.

337. Why I Spent Years Writing a Children’s Book on Data Science

I wrote a children's book on data science to inform others who have a hard time understanding data science and machine learning concepts, especially kids!

338. How Big Tech Influences Privacy Laws

The Markup reviewed public hearing testimony in all 31 states that have considered consumer data privacy legislation since 2021 and found a campaign by Big Tech

339. What is Data Collection and What are The Most Important Events to Track

When your company is client-oriented, one of your priority tasks is understanding your clients’ problems and gathering insights on how people use your product and when exactly they benefit from it.

340. This Online Abortion Pill Provider Used Tracking Tools That Gave Powerful Companies Your Data

The trackers notified Google, Facebook’s parent company Meta, payments processor Stripe, and four analytics firms when users visited its site.

341. Creating a Dependable Data Pipeline for Your Small Business

In this article, I will be showing you how to build a reliable data pipeline for your small business to improve your productivity and data security.

342. Apache Airflow: Is It a Good Tool for Data Quality Checks?

Learn the impact of Airflow on data quality checks and why you should look for an alternative tool.

343. AI-enabled Smart Cities: What to Get Right

Data is the foundation of smart cities. However, to deliver the right solutions, planners must establish sustainable data and AI technology policies.

344. Tired of Dirty Data? It’s Time to Implement a Data Scrubbing Initiative

Raw data coming in from various sources is often inherently dirty data, rife with factual errors, typos and in