

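A Beginner's Guide to Big Data

Structured Data vs. Unstructured Data

How we define and interpret data has changed significantly over the past decade, in part because of the spread of new tools for reading, storing, and analyzing unstructured data.

Historically, unstructured data was underused because it was hard to interpret. Newer technologies have made it easier to make sense of unstructured data and to draw valuable insights from this rich source of information.

IDC forecast that by 2024 the total volume of data created, captured, copied, and consumed worldwide would exceed 149 zettabytes per year, much of it unstructured. Building the capability to analyze unstructured data pays off, and the first step is understanding how structured and unstructured data differ.

The table below summarizes the differences between the two; a more detailed explanation follows.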

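Characteristic	Structured data	Unstructured data
Nature of data	Usually quantitative	Usually qualitative
Data model	Predefined; hard to change once defined and data has been stored	No specific schema; the data model is highly flexible
Data formats	Limited set of usable formats	Very wide variety of formats
Database	SQL-based relational databases	NoSQL databases with no specific schema
Search	Easy to search within a database or dataset	Hard to locate specific data because it has no structure
Analysis	Easy to analyze because the data is quantitative	Very difficult to analyze, even with software tools
Storage	Stored in data warehouses	Stored in data lakes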


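What Is Structured Data?

Structured data has a well-defined schema for the information it holds. To give an extremely simple definition, any data that can be presented in a spreadsheet program like Google Sheets or Microsoft Excel is structured data.

In that case, the data is represented as rows and columns. Each column represents a different attribute, and each row holds the values of those attributes for a single instance. Together, the rows and columns form a table that can be referenced easily.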
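As a minimal sketch of the row-and-column idea, the snippet below parses a small spreadsheet-style CSV with Python's standard `csv` module. The column names and sample values are made up for illustration; they are not taken from any real dataset.

```python
import csv
import io

# Hypothetical spreadsheet contents: the header row names the columns (attributes),
# and every following line is one row (one instance).
SHEET = """customer_id,name,city
1,Aiko Tanaka,Osaka
2,Ben Ortiz,Austin
"""

def load_rows(text):
    """Parse CSV text into a list of rows, each a dict keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def column(rows, name):
    """Return every value stored under one attribute (column)."""
    return [row[name] for row in rows]

rows = load_rows(SHEET)
print(rows[0]["name"])       # one cell: first row, column "name"
print(column(rows, "city"))  # one whole column
```

Because every row shares the same columns, any cell can be addressed by a row position and a column name, which is what makes the table easy to reference.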

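Different tables can be connected: two tables are said to be related when they share a common column.

When multiple tables are related to one another in succession and in combination, the result is a relational database. For instance, the customer, sales, and inventory data of a department store is structured data stored as a relational database.

Each customer has a customer ID, as well as fields for name, contact number, credit card information, address, and so on.
The customer table can be connected to the sales table, whose attributes include the time of purchase, the item codes purchased, the total amount spent, and the customer ID. The two tables are related through the common attribute of customer ID.
Finally, the sales table can be connected to the inventory table through the common attribute of item code, which interconnects all three tables into a single relational database.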
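The department store example can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows below are assumptions chosen for illustration, and each sale is simplified to a single item; a real store schema would be richer.

```python
import sqlite3

def build_store_db():
    """Create an in-memory database with three related tables (illustrative names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE inventory (
            item_code   TEXT PRIMARY KEY,
            description TEXT NOT NULL,
            unit_price  REAL NOT NULL
        );
        CREATE TABLE sales (
            sale_id      INTEGER PRIMARY KEY,
            customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
            item_code    TEXT NOT NULL REFERENCES inventory(item_code),
            purchased_at TEXT NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Aiko Tanaka'), (2, 'Ben Ortiz');
        INSERT INTO inventory VALUES ('A100', 'Umbrella', 12.5), ('B200', 'Scarf', 20.0);
        INSERT INTO sales VALUES
            (1, 1, 'A100', '2024-01-05T10:15:00'),
            (2, 2, 'B200', '2024-01-05T11:30:00'),
            (3, 1, 'B200', '2024-01-06T09:00:00');
    """)
    return conn

def purchases_by_customer(conn, customer_id):
    """Join sales to customers on customer_id and to inventory on item_code."""
    query = """
        SELECT c.name, i.description, i.unit_price
        FROM sales AS s
        JOIN customers AS c ON c.customer_id = s.customer_id
        JOIN inventory AS i ON i.item_code = s.item_code
        WHERE s.customer_id = ?
        ORDER BY s.purchased_at
    """
    return conn.execute(query, (customer_id,)).fetchall()

conn = build_store_db()
for row in purchases_by_customer(conn, 1):
    print(row)
```

The two `JOIN` clauses are the "common columns" from the text: `customer_id` links sales to customers, and `item_code` links sales to inventory.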

Structured data like this is generally stored in relational database management systems (RDBMSes). These databases can be written to, read, and manipulated using Structured Query Language (SQL), a language developed at IBM in the 1970s to support its mainframe databases. It was initially known as SEQUEL (Structured English Query Language), so named because it reads much like English. SQL in its current form was popularized by Relational Software, Inc. (now Oracle).

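What Is Unstructured Data?

Unstructured data is any data that is not structured data. It is estimated that by 2025, 80% of the data we encounter will be unstructured, in the form of text, audio, images, or video.¹

In short, unstructured data is modern data. It is often:

Born digital and unpredictable
Always being created and on the move
Blended, multimodal, and interoperable
Geo-distributed for better protection

Unstructured data can have associated metadata that does have a structure. For example, a video can carry metadata such as resolution, bit rate, frames per second (FPS), and owner, but the video itself is unstructured. Unstructured data with structured metadata attached is sometimes referred to as semi-structured data.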
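One way to picture the split between structured metadata and unstructured content is the sketch below. The class and field names are hypothetical, and the payload bytes are placeholders rather than real encoded video.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VideoMetadata:
    """Structured part: every video has the same named, typed fields."""
    width: int
    height: int
    bit_rate_kbps: int
    fps: float
    owner: str

@dataclass(frozen=True)
class Video:
    """Structured metadata paired with an unstructured payload."""
    metadata: VideoMetadata
    payload: bytes  # opaque encoded video; no schema describes what it shows

def find_by_min_fps(videos, min_fps):
    """Metadata can be filtered like a table column; the payload cannot."""
    return [v for v in videos if v.metadata.fps >= min_fps]

videos = [
    Video(VideoMetadata(1920, 1080, 8000, 60.0, "alice"), b"\x00\x01placeholder"),
    Video(VideoMetadata(1280, 720, 2500, 24.0, "bob"), b"\x00\x02placeholder"),
]
fast = find_by_min_fps(videos, 30.0)
print([v.metadata.owner for v in fast])
```

Questions about the metadata ("which videos run at 30 FPS or more?") are simple lookups, while questions about the content ("which videos show an umbrella?") need separate analysis of the payload.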

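Looking more closely at the example of a YouTube video, some metadata is present, such as the date and time of upload, the number of views (partial or full), and the number of likes and dislikes. But the video itself, along with its title and description, is unstructured: it has a qualitative aspect that numbers alone cannot capture.

The databases most commonly used for unstructured data are NoSQL databases. NoSQL stands for "not only SQL," indicating that these databases can handle a wider range of data than SQL databases can. NoSQL databases generally do not impose a fixed schema or tabular structure; the data is simply grouped together into collections.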
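To illustrate what "no fixed schema" means in practice, here is a toy stand-in for a document collection written in plain Python. It is not the API of any real NoSQL database; the `find` helper and the sample documents are assumptions made for this sketch.

```python
import json

# A toy "collection": documents grouped together with no shared schema.
collection = [
    {"_id": 1, "type": "review", "text": "Great umbrella, sturdy in wind.", "stars": 5},
    {"_id": 2, "type": "video", "title": "Unboxing", "duration_s": 312, "tags": ["demo"]},
    {"_id": 3, "type": "review", "text": "Scarf arrived late."},  # no "stars" field
]

def find(docs, **criteria):
    """Return documents whose fields equal every given criterion.

    A document that lacks a field simply does not match.
    """
    missing = object()  # sentinel that never equals a real value
    return [
        d for d in docs
        if all(d.get(key, missing) == value for key, value in criteria.items())
    ]

reviews = find(collection, type="review")
print(json.dumps(reviews, indent=2))
print(find(collection, stars=5))
```

Each document carries its own set of fields, so adding a new kind of record needs no schema change. The trade-off is that queries have to tolerate missing fields, as the third document shows.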

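Storing Unstructured Data with UFFO

Unstructured data can yield important insights with the potential for major transformation, but putting it to work comes with a range of challenges. FlashBlade, Pure Storage's advanced UFFO storage solution, delivers the speed of flash storage technology along with the ability to scale any architecture with agility. If you are interested, a test drive is available that lets you try FlashBlade free of charge.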