At VIM AEC we process extremely large architectural models and import meshes and BIM data into game engines (such as Unity), 3D editing tools (such as 3ds Max), and applications running on different devices (such as the Magic Leap).
When we use FBX to represent large meshes, importing a 28-million-polygon model into Unity takes close to four minutes. As an alternative we developed G3D, an open format for representing 3D geometry, which reduced the import time to just under 6 seconds.
The G3D format can represent triangular and quadrilateral meshes, point clouds, line segments, and polygonal meshes, along with arbitrary attributes (e.g. normals, UVs, colors, smoothing groups, etc.) associated with different sub-elements of the mesh (vertices, faces, face-corners, polygon groups, or the whole object).
Our G3D loader is over an order of magnitude faster in our tests than reading OBJ, FBX, or PLY files with the Assimp model import library or importing them into Unity. The specification and reference implementation are also considerably simpler than those of other binary formats such as glTF, while supporting a wider range of data attributes.
On the G3D GitHub repo we provide a reference implementation in C# using .NET Standard 2.0 (making it cross-platform) and a Unity test project for importing and exporting G3D meshes. We have also released a NuGet package, which we use in our production code.
Motivation
The common data formats for representing 3D geometry (e.g. OBJ, FBX, Collada, glTF, etc.) generally have one or more shortcomings:
Our goal was to design a format that would enable us to quickly and easily implement importers and exporters as plug-ins to different editing tools, game engines, and applications running on different platforms from desktop to WebGL to spatial computing devices like the Magic Leap.
We also wanted this format to be able to transport a wide variety of data from different sources without loss.
The G3D format is fast because its data does not have to be pre-processed before it reaches a renderable, GPU-friendly state.
Text-based mesh formats such as OBJ, PLY, and Collada require the computer to spend a significant amount of time converting a text representation into a binary representation that the renderer can consume.
Similarly, in some binary formats such as FBX, the mesh vertex data is organized as vectors of 4 double-precision floating point values. Most rendering contexts expect vertex position data to be encoded as vectors of 3 single-precision floats, which means that a pre-processing step is required to drop the fourth component, narrow the values to single precision, and pack them into a more convenient format.
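To make the cost of that step concrete, here is a minimal sketch of the conversion. It assumes the source buffer is a tightly packed, little-endian run of (x, y, z, w) doubles, as the paragraph above describes; the function name `narrow_positions` is hypothetical and is not part of any FBX or G3D API.

```python
import struct

def narrow_positions(src: bytes) -> bytes:
    """Convert packed (x, y, z, w) float64 vectors to packed (x, y, z) float32.

    Assumes little-endian input with no padding between vectors.
    """
    if len(src) % 32 != 0:  # 4 doubles * 8 bytes each
        raise ValueError("buffer is not a whole number of 4-double vectors")
    out = bytearray()
    # Every vertex is touched once and a second buffer is allocated:
    # this is the work a GPU-ready format avoids.
    for x, y, z, _w in struct.iter_unpack("<4d", src):
        out += struct.pack("<3f", x, y, z)
    return bytes(out)

# Two vertices stored as 4 doubles each (64 bytes) become 6 floats (24 bytes).
source = struct.pack("<8d", 1.0, 2.0, 3.0, 1.0, 4.0, 5.0, 6.0, 1.0)
packed = narrow_positions(source)
```

For a model with tens of millions of vertices this loop, or its equivalent in any language, runs once per vertex before anything can be uploaded.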
G3D data buffers are strictly aligned, so that once the entire file is loaded into memory, the individual data buffers can be passed to the GPU as-is with no additional processing or memory allocations.
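As an illustration of what "as-is" means, the sketch below reinterprets a slice of an already loaded file as floats without copying it. The function name, the offset, and the demo buffer are assumptions made for the example; in a real file the offsets come from the container header, and the cast uses the machine's native byte order.

```python
import struct

def view_floats(file_bytes: bytes, offset: int, count: int) -> memoryview:
    """Return a zero-copy float32 view over `count` floats starting at `offset`."""
    raw = memoryview(file_bytes)[offset:offset + 4 * count]
    if raw.nbytes != 4 * count:
        raise ValueError("requested range runs past the end of the file")
    # cast() reinterprets the same bytes; no new vertex array is allocated.
    return raw.cast("f")

# Hypothetical file: 64 bytes of header/padding, then three native-order floats.
demo_file = bytes(64) + struct.pack("=3f", 1.0, 2.0, 3.0)
positions = view_floats(demo_file, 64, 3)
```

A view like `positions` is the kind of object that can be handed to a graphics API that accepts raw buffers, which is the step the alignment guarantee is meant to enable.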
Commonly, a 3D mesh consists of a series of points in space (called vertices) and a list of faces that specify how those points are connected to make a faceted surface in three-dimensional space.
Image from Wikipedia licensed under CC-SA-3.0 by RChoetzlien
The list of points is often called a vertex buffer, and the list of faces is represented as an index buffer. If the size of each face is fixed (e.g. 3 for triangles or 4 for quads), then every run of N consecutive indices gives the vertex indices of the corners of one face. Given this observation, it follows naturally that point clouds and line segments are degenerate cases of faces with only one or two points per face, respectively. Meshes that mix polygons of different sizes can be encoded using an additional data array holding the size of each face.
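The indexing scheme described above can be sketched in a few lines. The function names are hypothetical, and the buffers are plain Python lists rather than the binary arrays a G3D file would hold.

```python
def faces(indices, corners_per_face):
    """Split a flat index buffer into faces of a fixed size."""
    n = corners_per_face
    if len(indices) % n != 0:
        raise ValueError("index count is not a multiple of the face size")
    return [tuple(indices[i:i + n]) for i in range(0, len(indices), n)]

def mixed_faces(indices, face_sizes):
    """Split a flat index buffer using a per-face size array."""
    result, start = [], 0
    for size in face_sizes:
        result.append(tuple(indices[start:start + size]))
        start += size
    if start != len(indices):
        raise ValueError("face sizes do not add up to the index count")
    return result

triangles = faces([0, 1, 2, 2, 1, 3], 3)            # two triangles
segments = faces([0, 1, 1, 2], 2)                   # line segments: 2 per face
points = faces([0, 1, 2], 1)                        # point cloud: 1 per face
polygons = mixed_faces([0, 1, 2, 0, 2, 3, 4], [3, 4])  # a triangle and a quad
```

The same function handles triangles, lines, and points, which is why the fixed-size cases need no special treatment.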
There are two types of edges to consider: undirected edges and directed edges (also called half-edges). In G3D, as in 3ds Max, the term edge refers to the half-edge.
Half-edges are so named because each of two adjacent faces has its own directed half-edge along their common boundary. The two half-edges share the same vertices but run in opposite directions, and together they make up the full edge that separates the two faces. Enumerating half-edges is simpler than enumerating undirected edges, because every face has N half-edges, where N is the number of points in the face. The half-edge index buffer is therefore exactly the same as the index buffer.
A surface or polygon group is a sequence of contiguous faces that together make a continuous surface. Often this surface will share a common material. Consider a cylinder: the curved side can be thought of as one surface, and each of the two end caps is a surface of its own.
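A short sketch shows why the index buffer doubles as the half-edge buffer. It assumes fixed-size faces and the common convention that half-edge i starts at corner i and ends at the next corner of the same face; the function name is hypothetical.

```python
def half_edges(indices, corners_per_face):
    """Return (start_vertex, end_vertex) for every half-edge, one per index."""
    n = corners_per_face
    edges = []
    for i, start in enumerate(indices):
        face_start = i - i % n                    # first corner of this face
        end = indices[face_start + (i + 1) % n]   # next corner, wrapping around
        edges.append((start, end))
    return edges

# Two triangles sharing the edge between vertices 1 and 2.
edges = half_edges([0, 1, 2, 2, 1, 3], 3)
```

In this example the shared edge appears twice, once as (1, 2) in the first triangle and once as (2, 1) in the second, which is the pairing of opposite half-edges described above.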
Having polygon groups enables more compact data representation (e.g. we can have a shared material ID per group, or a single shared normal in the case of a planar surface). Polygon groups can also be used to encode “sub-mesh” data from Unity or a Revit API Face.
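The space saving can be illustrated with a small sketch. It assumes, purely for the example, that groups are described by the number of contiguous faces in each; the encoding G3D actually uses for groups may differ, and the function name is hypothetical.

```python
def expand_group_materials(group_face_counts, group_material_ids):
    """Expand one material ID per group into one material ID per face."""
    if len(group_face_counts) != len(group_material_ids):
        raise ValueError("need exactly one material ID per group")
    per_face = []
    for count, material in zip(group_face_counts, group_material_ids):
        per_face.extend([material] * count)
    return per_face

# A cylinder approximated with 32 side faces and two 1-face caps:
# three stored material IDs stand in for thirty-four per-face IDs.
per_face_materials = expand_group_materials([32, 1, 1], [7, 3, 3])
```

Storing the value once per group and expanding it only when a consumer needs per-face data keeps the file small without losing information.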
The G3D mesh format is based on representing geometry as a collection of strictly aligned binary arrays called attributes. An attribute is an array of scalars or vectors associated with vertices, face corners, faces, or polygon groups. In some cases an attribute is a single value associated with the entire object. Some common examples of attributes include:
Each attribute in a G3D is associated with a descriptor, which is encoded as a string in the following format:
g3d:<association>:<semantic>:<data_type>:<data_arity>
The descriptor consists of the following components:
Multiple attributes may share the same descriptor string, such as the multiple UV channels stored in a Unity mesh.
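Parsing a descriptor string is straightforward, as the following sketch suggests. The class and function names are hypothetical, and the example value `g3d:vertex:position:float32:3` is an illustrative guess at what a position attribute might look like, not a string taken from the specification.

```python
from typing import NamedTuple

class AttributeDescriptor(NamedTuple):
    association: str   # which sub-element the attribute is attached to
    semantic: str      # the role of the data
    data_type: str     # scalar type of each component
    data_arity: int    # number of components per element

def parse_descriptor(text: str) -> AttributeDescriptor:
    """Parse 'g3d:<association>:<semantic>:<data_type>:<data_arity>'."""
    parts = text.split(":")
    if len(parts) != 5 or parts[0] != "g3d":
        raise ValueError(f"not a G3D attribute descriptor: {text!r}")
    _, association, semantic, data_type, arity = parts
    return AttributeDescriptor(association, semantic, data_type, int(arity))

# Hypothetical descriptor for per-vertex positions stored as 3 floats.
descriptor = parse_descriptor("g3d:vertex:position:float32:3")
```

Because several attributes may carry the same descriptor, a loader would keep them in a list in file order rather than in a dictionary keyed by descriptor.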
The underlying binary layout of a G3D file conforms to the BFAST serialization format, a simple and efficient open binary format for serializing collections of named byte arrays. (See this article on Hackernoon)
The first named buffer in the BFAST container is reserved for meta-information about the file encoded in JSON format. It has the name “meta”. There are currently no restrictions or requirements on what data is encoded in the meta JSON object.
Each subsequent data buffer stores attribute data and uses the attribute descriptor string as its name. As per the BFAST specification, attribute data is stored in 64-byte aligned data buffers.
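The alignment rule itself amounts to rounding every buffer's starting offset up to the next multiple of 64. The sketch below shows only that arithmetic; it does not reproduce the BFAST header layout, and the function names and the starting offset of 0 are assumptions for the example.

```python
ALIGNMENT = 64

def align_up(offset: int, alignment: int = ALIGNMENT) -> int:
    """Round `offset` up to the next multiple of `alignment`."""
    return (offset + alignment - 1) // alignment * alignment

def buffer_offsets(sizes, start=0):
    """Place buffers one after another, padding each start to the alignment."""
    offsets, cursor = [], start
    for size in sizes:
        cursor = align_up(cursor)
        offsets.append(cursor)
        cursor += size
    return offsets

# Three buffers of 10, 100, and 64 bytes.
offsets = buffer_offsets([10, 100, 64])
```

A writer pays a few bytes of padding per buffer, and in return a reader can treat each buffer as an aligned array the moment the file is in memory.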
Attributes also have a “semantic” tag, which identifies the role the attribute has in rendering or processing. An application may use any string it chooses to represent the semantic; however, the G3D specification recommends a number of semantics with predefined meanings (see the GitHub readme.md for the full list). This is intended to make it possible for different applications to exchange, process, and render G3D data in a common manner.
If you are interested in understanding the design decisions that led up to the G3D format, it can be useful to read up on how other mesh representations are designed:
At VIM AEC we are very interested in helping others adopt the G3D format, as we believe it can help improve the quality of graphics software for everyone. Please reach out to me via email or the GitHub issues to ask questions or make suggestions.
Disclaimer: I work at VIM AEC as Head of Research.