Using Scanned Mesh Data for Auto-Digitized 3D Modeling: Methodology

Authors: (1) Ritesh Sharma, University Of California, Merced, USA rsharma39@ucmerced.edu; (2) Eric Bier, Palo Alto Research Center, USA bier@parc.com; (3) Lester Nelson, Palo Alto Research Center, USA lnelson@parc.com; (4) Mahabir Bhandari, Oak Ridge National Laboratory, USA bhandarims@ornl.gov; (5) Niraj Kunwar, Oak Ridge National Laboratory, USA kunwarn1@ornl.gov. Table of Links Abstract and Intro Related Work Methodology Experiments Conclusion & Future work and References 3 Methodology We compute floor plans in four main steps (see Algorithm 1). First, a user captures the interior of the building as a triangle mesh using an augmented reality headset. The mesh is oriented to align with primary axes, and the building is divided into stories. Floors and ceilings are removed, and flat walls are detected if desired. Finally, one of two floor plan styles is generated by slicing and projecting the resulting 3D model. Next we describe these steps in detail. Data collection Indoor environments can be captured in various formats using different devices. We use a Microsoft HoloLens 2 headset to capture triangle mesh data, annotating the mesh using voice commands. Capturing the triangle mesh The HoloLens provides hardware and software to create a 3D representation of the indoor environment using triangles, as shown in Figure 1-left. The headset overlays the triangles on the user’s view of the building interior. Although the headset captures most of the walls, floors, and ceilings, data may be missing from some regions, as in the figure. Annotating the mesh To capture object positions such as sensors, thermostats, windows, and doors, we developed an augmented reality (AR) user interface. This interface uses eye gaze detection and voice commands to enable users to place synthetic objects at desired locations, as shown in Figure 1-right where a synthetic sensor object is added to the immersive environment, superimposed on a physical sensor. 3.1 Mesh orientation After the annotated triangle mesh is captured, geometric processing is performed. Initially, the mesh’s orientation is based on the user’s starting position and gaze direction. To generate a floor plan, we must determine the floor’s position and facing direction. The AR headset provides a rough estimation of gravity direction, but additional computation improves precision. Orienting the floor To determine the mesh orientation, we tested two methods: (1) compute the shortest edges of the mesh bounding box, and (2) cluster the facing directions of mesh triangles using spherical k-means. Method (1) works for buildings with constant altitude and large floor area, but it fails on others, so we mainly use Method (2), described in Algorithm 2. Algorithm 2 applies to a broad range of meshes, including multi-story buildings with vertical dominance. It uses the surface normal vector of each triangle ∆ in the mesh M and filters out triangles deviating significantly from the positive y direction, preserving those likely to represent the floor (∆ ′). We use a spherical coordinates k-means algorithm with k = 1 to find the dominant direction gm of these triangles. We discard triangles that are more han an angle ϕ from the dominant direction and repeat the k-means algorithm until ϕmin is reached (e.g., start with ϕ = 30 degrees and end with ϕmin = 3). This gives an estimate of the true gravity direction gt. To orient the mesh, we compute the angle θ between gt and the negative yaxis and determine the rotation axis Y by taking their cross product. We rotate the mesh by θ around the Y axis, ensuring a horizontal floor. Further details on this method for floor orientation are in Algorithm 2. Figure 2 shows a model where the floor is not level, but tilts down from near to far and from right to left. After Algorithm 2, the floor is horizontal. Finding the height of the floor After orienting the mesh to have a horizontal floor, we find the altitude of the floor in the y direction: we take the centroid of each mesh triangle whose facing direction is within a small angle of the positive y axis. We create a histogram of the y coordinates of these centroids, with each bucket representing a vertical range, such as 2 inches. We consider adjacent pairs of buckets and look for the pair with the highest number of points, such as (0, 1), (1, 2), etc. For a single-story building, we search for two large bucket pairs representing the floor (near the bottom) and the ceiling (near the top). If the building has sunken floors or raised ceilings, the histogram will show spikes at similar but not identical altitudes. To ensure that we locate true ceilings and floors, we search for a gap of several feet (such as the expected floor-to-ceiling height of a room) between the low and high histogram spikes. The spikes below this gap are probably floors, and those above are probably ceilings. To generate the floor plan, we choose the highest of the floor levels and the lowest of the ceiling levels as the computed floor and ceiling levels, respectively. Pairing the buckets rather than taking them individually ensures that we do not overlook spikes in the histogram if the mesh triangles are distributed evenly across two adjacent buckets. Rotate mesh and associated annotations Our next goal is to align the mesh model’s primary wall directions with the axes of Euclidean coordinates. One optional step is to eliminate mesh triangles whose surface normals are within a small angle from the positive or negative y directions, as these are probably ceiling or floor triangles. This step is not mandatory, but decreases the number of triangles to be processed. Additionally, we eliminate all triangles below the computed floor altitude and all above the computed ceiling altitude. We then examine the surface normals of the remaining triangles. We express each normal in spherical coordinates and use spherical k-means clustering to identify the dominant wall directions. Assuming the building has mainly perpendicular walls, there will be four primary wall directions, so we can set k = 4 for k-means clustering. If the model still has floor and ceiling triangles, we can set k = 6 to account for the two additional primary directions. Figure 3-left illustrates a heat map of surface normal directions in spherical coordinates from an office building mesh. Figure 3-left contains many light blue rectangles that are far from any cluster center (e.g., far from the buckets that are red, orange, and white). These represent triangles whose facing directions do not line up with any of the primary walls, floors, or ceilings. Such triangles exist for two reasons: (1) Building interiors contain many objects that are not walls, floors, or ceilings, such as furniture, documents, office equipment, artwork, etc. These objects may be placed at any angle. (2) The AR headset generates triangles that bridge across multiple surfaces (e.g., that touch multiple walls) and hence point in an intermediate direction. To compensate, we use a modified version of spherical coordinates k-means clustering that ignores triangle directions that are outliers as follows: After computing spherical k-means in the usual way, we look for all triangles in each cluster whose facing direction is more than a threshold θ1 from the cluster center. We discard all such triangles. Then we run k-means again, computing updated cluster centers. Then we discard all triangles that are more than θ2 from each cluster center where θ2 < θ1. We repeat this process several times until we achieve the desired accuracy. For example, in our current implementation, we use this sequence of angles θi in degrees: [50, 40, 30, 20, 10, 5, 3]. Once our modified k-means algorithm completes, we have 4 (or 6) cluster centers. Figure 3-right shows sample results for k = 6. Once the primary wall directions are computed, we pick the cluster with the largest number of triangles, take its direction (cluster center), project that direction onto the x−z plane, and call it θwall. We rotate the mesh by the angle between θwall and the x axis. Now the primary walls will be pointing along the x axis. Figure 4-left shows a building that is not aligned with the axes. Figure 4- right shows the same building after wall rotation. Adding in the x axis (in red) and z axis (in blue), we see that the walls are now well-aligned with the axes. Dividing a mesh into separate levels The HoloLens can digitize multi-story buildings. Given a multi-story model, we can compute a floor plan for each story. The process is similar to the one used in 3.1 to find the height of the floor. First, our system computes a histogram, as shown in figure 5 and segments the building into multiple levels, as shown in figure 5-middle and right. 3.2 Floor plan computation Our floor plan computation depends on the type of floor plan desired and whether the mesh is oriented with respect to the global axes. If we desire a pen-and-ink style floor plan and the mesh is oriented, we can simply pass the mesh M to the ComputeAndSuperimposeSlices() function, as in line 14 of Algorithm 3.2. However, if the mesh is not properly oriented, we align it with the global axes before computing the floor plan. If a drafting-style floor plan is desired, we utilize lines 2-13 of Algorithm 3.2 to compute flat walls. Computing flat walls To generate a drafting-style floor plan, we compute flat walls and separate them from other building contents using these steps: DBSCAN For each wall direction, we perform a modified DBSCAN algorithm: We compute the centroid C of each triangle ∆i . For each centroid point Ci during DBSCAN, we count the number of other centroid points that are near enough to be considered neighbors. However, instead of looking for neighbors in a sphere around each point as for traditional DBSCAN in 3D, we look for neighbors in a rectangular block of length l, width w and height h centered on the point. This block is tall enough to reach from floor to ceiling in the y direction, a little less wide than a door in the direction parallel to the proposed wall (e.g., 1.5 feet), and a few inches in the wall direction (to allow for walls that deviate slightly from being perfectly flat). Relying on the National Building Code, the wall’s minimum height is set at 8 feet, with a thickness of 8 inches. After DBSCAN, the mesh triangles are grouped into wall segments W S. Filtering We discard wall segments that are not good candidates, such as walls that are too small, that aren’t near the floor, or that aren’t near the ceiling. Plane fitting For each wall segment, we find a plane that has the same facing direction as the wall direction and that is a good fit to the triangle centroids in that wall segment. Given that the points are tightly collected in this direction, simply having the plane go through any centroid works surprisingly well. However, it is also possible to choose a point more carefully, such as by finding a point at a median position in the wall direction. Rectangle construction For each remaining wall segment, we construct rectangles R that lie in the fitted plane and are as wide as the wall segment triangles in width and as tall as the wall segment triangles in height. Mesh replacement For floor plan construction, we discard the original mesh triangles and replace them with the new planar wall rectangles to serve as a de-cluttered mesh. If the subsequent steps use libraries that expect a triangle mesh, we use two adjacent right triangles in place of each rectangle. for each i such that 0 ≤ i ≤ n. For each yi , we compute the intersection of the mesh with the plane y = yi . We end up with a stack of slices (see Figure 7). We can use the same method to produce a drafting-style floor plan. In this case, we begin with the flat wall model instead of the full mesh. This model has fewer details to capture by slicing at multiple altitudes, so we may choose to slice at a single intermediate altitude. Drawing the floor plan For either style of floor plan, we can project the slices to a plane by ignoring the y coordinates of the resulting line segments and plotting the resulting (x, z) coordinates as a two-dimensional image. We have also found it informative and aesthetically pleasing to draw the lines of each slice in a partially-transparent color, so that features that occur at multiple altitudes appear darker than features that occur only at a single altitude. As an example, Figure 8-left shows a mesh gathered from a commercial building. Figure 8-center shows the result of our DBSCAN on that data. Figure 8-right shows the flat walls that result after mesh replacement. Figure 9-right shows the drafting-style floor plan that results from slicing the flat walls. Figure 9-left shows the pen-and-ink floor plan made by slicing the oriented mesh at multiple altitudes. Drawing synthetic objects Because our data comes from an AR headset, we can add synthetic objects to mark the positions of objects in a room, such as sensors and windows. We can display these objects in our 3D models and floor plans. For the steps of mesh processing described above, we note the geometric transformations applied to the mesh and apply the same transformations to the synthetic objects, which then appear in the correct places in the 3D views and in the floor plans. For example, the black objects in Figure 8-left represent the objects placed by the user to show the positions of sensors and windows. Likewise, the red objects in the floor plan of Figure 9-left are those same objects, projected onto the same plane as the mesh slices. This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license. Authors: (1) Ritesh Sharma, University Of California, Merced, USA rsharma39@ucmerced.edu; (2) Eric Bier, Palo Alto Research Center, USA bier@parc.com; (3) Lester Nelson, Palo Alto Research Center, USA lnelson@parc.com; (4) Mahabir Bhandari, Oak Ridge National Laboratory, USA bhandarims@ornl.gov; (5) Niraj Kunwar, Oak Ridge National Laboratory, USA kunwarn1@ornl.gov. Authors: Authors: (1) Ritesh Sharma, University Of California, Merced, USA rsharma39@ucmerced.edu; (2) Eric Bier, Palo Alto Research Center, USA bier@parc.com; (3) Lester Nelson, Palo Alto Research Center, USA lnelson@parc.com; (4) Mahabir Bhandari, Oak Ridge National Laboratory, USA bhandarims@ornl.gov; (5) Niraj Kunwar, Oak Ridge National Laboratory, USA kunwarn1@ornl.gov. Table of Links Abstract and Intro Abstract and Intro Related Work Related Work Methodology Methodology Experiments Experiments Conclusion & Future work and References Conclusion & Future work and References 3 Methodology We compute floor plans in four main steps (see Algorithm 1). First, a user captures the interior of the building as a triangle mesh using an augmented reality headset. The mesh is oriented to align with primary axes, and the building is divided into stories. Floors and ceilings are removed, and flat walls are detected if desired. Finally, one of two floor plan styles is generated by slicing and projecting the resulting 3D model. Next we describe these steps in detail. Data collection Indoor environments can be captured in various formats using different devices. We use a Microsoft HoloLens 2 headset to capture triangle mesh data, annotating the mesh using voice commands. Data collection Capturing the triangle mesh The HoloLens provides hardware and software to create a 3D representation of the indoor environment using triangles, as shown in Figure 1-left. The headset overlays the triangles on the user’s view of the building interior. Although the headset captures most of the walls, floors, and ceilings, data may be missing from some regions, as in the figure. Capturing the triangle mesh Annotating the mesh To capture object positions such as sensors, thermostats, windows, and doors, we developed an augmented reality (AR) user interface. This interface uses eye gaze detection and voice commands to enable users to place synthetic objects at desired locations, as shown in Figure 1-right where a synthetic sensor object is added to the immersive environment, superimposed on a physical sensor. Annotating the mesh 3.1 Mesh orientation After the annotated triangle mesh is captured, geometric processing is performed. Initially, the mesh’s orientation is based on the user’s starting position and gaze direction. To generate a floor plan, we must determine the floor’s position and facing direction. The AR headset provides a rough estimation of gravity direction, but additional computation improves precision. Orienting the floor To determine the mesh orientation, we tested two methods: (1) compute the shortest edges of the mesh bounding box, and (2) cluster the facing directions of mesh triangles using spherical k-means. Method (1) works for buildings with constant altitude and large floor area, but it fails on others, so we mainly use Method (2), described in Algorithm 2. Orienting the floor Algorithm 2 applies to a broad range of meshes, including multi-story buildings with vertical dominance. It uses the surface normal vector of each triangle ∆ in the mesh M and filters out triangles deviating significantly from the positive y direction, preserving those likely to represent the floor (∆ ′). We use a spherical coordinates k-means algorithm with k = 1 to find the dominant direction gm of these triangles. We discard triangles that are more han an angle ϕ from the dominant direction and repeat the k-means algorithm until ϕmin is reached (e.g., start with ϕ = 30 degrees and end with ϕmin = 3). This gives an estimate of the true gravity direction gt. To orient the mesh, we compute the angle θ between gt and the negative yaxis and determine the rotation axis Y by taking their cross product. We rotate the mesh by θ around the Y axis, ensuring a horizontal floor. Further details on this method for floor orientation are in Algorithm 2. Figure 2 shows a model where the floor is not level, but tilts down from near to far and from right to left. After Algorithm 2, the floor is horizontal. Finding the height of the floor After orienting the mesh to have a horizontal floor, we find the altitude of the floor in the y direction: we take the centroid of each mesh triangle whose facing direction is within a small angle of the positive y axis. We create a histogram of the y coordinates of these centroids, with each bucket representing a vertical range, such as 2 inches. We consider adjacent pairs of buckets and look for the pair with the highest number of points, such as (0, 1), (1, 2), etc. For a single-story building, we search for two large bucket pairs representing the floor (near the bottom) and the ceiling (near the top). Finding the height of the floor If the building has sunken floors or raised ceilings, the histogram will show spikes at similar but not identical altitudes. To ensure that we locate true ceilings and floors, we search for a gap of several feet (such as the expected floor-to-ceiling height of a room) between the low and high histogram spikes. The spikes below this gap are probably floors, and those above are probably ceilings. To generate the floor plan, we choose the highest of the floor levels and the lowest of the ceiling levels as the computed floor and ceiling levels, respectively. Pairing the buckets rather than taking them individually ensures that we do not overlook spikes in the histogram if the mesh triangles are distributed evenly across two adjacent buckets. Rotate mesh and associated annotations Our next goal is to align the mesh model’s primary wall directions with the axes of Euclidean coordinates. Rotate mesh and associated annotations One optional step is to eliminate mesh triangles whose surface normals are within a small angle from the positive or negative y directions, as these are probably ceiling or floor triangles. This step is not mandatory, but decreases the number of triangles to be processed. Additionally, we eliminate all triangles below the computed floor altitude and all above the computed ceiling altitude. We then examine the surface normals of the remaining triangles. We express each normal in spherical coordinates and use spherical k-means clustering to identify the dominant wall directions. Assuming the building has mainly perpendicular walls, there will be four primary wall directions, so we can set k = 4 for k-means clustering. If the model still has floor and ceiling triangles, we can set k = 6 to account for the two additional primary directions. Figure 3-left illustrates a heat map of surface normal directions in spherical coordinates from an office building mesh. Figure 3-left contains many light blue rectangles that are far from any cluster center (e.g., far from the buckets that are red, orange, and white). These represent triangles whose facing directions do not line up with any of the primary walls, floors, or ceilings. Such triangles exist for two reasons: (1) Building interiors contain many objects that are not walls, floors, or ceilings, such as furniture, documents, office equipment, artwork, etc. These objects may be placed at any angle. (2) The AR headset generates triangles that bridge across multiple surfaces (e.g., that touch multiple walls) and hence point in an intermediate direction. To compensate, we use a modified version of spherical coordinates k-means clustering that ignores triangle directions that are outliers as follows: After computing spherical k-means in the usual way, we look for all triangles in each cluster whose facing direction is more than a threshold θ1 from the cluster center. We discard all such triangles. Then we run k-means again, computing updated cluster centers. Then we discard all triangles that are more than θ2 from each cluster center where θ2 < θ1. We repeat this process several times until we achieve the desired accuracy. For example, in our current implementation, we use this sequence of angles θi in degrees: [50, 40, 30, 20, 10, 5, 3]. Once our modified k-means algorithm completes, we have 4 (or 6) cluster centers. Figure 3-right shows sample results for k = 6. Once the primary wall directions are computed, we pick the cluster with the largest number of triangles, take its direction (cluster center), project that direction onto the x−z plane, and call it θwall. We rotate the mesh by the angle between θwall and the x axis. Now the primary walls will be pointing along the x axis. Figure 4-left shows a building that is not aligned with the axes. Figure 4- right shows the same building after wall rotation. Adding in the x axis (in red) and z axis (in blue), we see that the walls are now well-aligned with the axes. Dividing a mesh into separate levels The HoloLens can digitize multi-story buildings. Given a multi-story model, we can compute a floor plan for each story. The process is similar to the one used in 3.1 to find the height of the floor. First, our system computes a histogram, as shown in figure 5 and segments the building into multiple levels, as shown in figure 5-middle and right. Dividing a mesh into separate levels 3.2 Floor plan computation Our floor plan computation depends on the type of floor plan desired and whether the mesh is oriented with respect to the global axes. If we desire a pen-and-ink style floor plan and the mesh is oriented, we can simply pass the mesh M to the ComputeAndSuperimposeSlices() function, as in line 14 of Algorithm 3.2. However, if the mesh is not properly oriented, we align it with the global axes before computing the floor plan. If a drafting-style floor plan is desired, we utilize lines 2-13 of Algorithm 3.2 to compute flat walls. Computing flat walls To generate a drafting-style floor plan, we compute flat walls and separate them from other building contents using these steps: Computing flat walls DBSCAN For each wall direction, we perform a modified DBSCAN algorithm: We compute the centroid C of each triangle ∆i . For each centroid point Ci during DBSCAN, we count the number of other centroid points that are near enough to be considered neighbors. However, instead of looking for neighbors in a sphere around each point as for traditional DBSCAN in 3D, we look for neighbors in a rectangular block of length l, width w and height h centered on the point. This block is tall enough to reach from floor to ceiling in the y direction, a little less wide than a door in the direction parallel to the proposed wall (e.g., 1.5 feet), and a few inches in the wall direction (to allow for walls that deviate slightly from being perfectly flat). Relying on the National Building Code, the wall’s minimum height is set at 8 feet, with a thickness of 8 inches. After DBSCAN, the mesh triangles are grouped into wall segments W S. Filtering We discard wall segments that are not good candidates, such as walls that are too small, that aren’t near the floor, or that aren’t near the ceiling. Filtering Plane fitting For each wall segment, we find a plane that has the same facing direction as the wall direction and that is a good fit to the triangle centroids in that wall segment. Given that the points are tightly collected in this direction, simply having the plane go through any centroid works surprisingly well. However, it is also possible to choose a point more carefully, such as by finding a point at a median position in the wall direction. Plane fitting Rectangle construction For each remaining wall segment, we construct rectangles R that lie in the fitted plane and are as wide as the wall segment triangles in width and as tall as the wall segment triangles in height. Rectangle construction Mesh replacement For floor plan construction, we discard the original mesh triangles and replace them with the new planar wall rectangles to serve as a de-cluttered mesh. If the subsequent steps use libraries that expect a triangle mesh, we use two adjacent right triangles in place of each rectangle. Mesh replacement for each i such that 0 ≤ i ≤ n. For each yi , we compute the intersection of the mesh with the plane y = yi . We end up with a stack of slices (see Figure 7). We can use the same method to produce a drafting-style floor plan. In this case, we begin with the flat wall model instead of the full mesh. This model has fewer details to capture by slicing at multiple altitudes, so we may choose to slice at a single intermediate altitude. Drawing the floor plan For either style of floor plan, we can project the slices to a plane by ignoring the y coordinates of the resulting line segments and plotting the resulting (x, z) coordinates as a two-dimensional image. We have also found it informative and aesthetically pleasing to draw the lines of each slice in a partially-transparent color, so that features that occur at multiple altitudes appear darker than features that occur only at a single altitude. Drawing the floor plan As an example, Figure 8-left shows a mesh gathered from a commercial building. Figure 8-center shows the result of our DBSCAN on that data. Figure 8-right shows the flat walls that result after mesh replacement. Figure 9-right shows the drafting-style floor plan that results from slicing the flat walls. Figure 9-left shows the pen-and-ink floor plan made by slicing the oriented mesh at multiple altitudes. Drawing synthetic objects Because our data comes from an AR headset, we can add synthetic objects to mark the positions of objects in a room, such as sensors and windows. We can display these objects in our 3D models and floor plans. For the steps of mesh processing described above, we note the geometric transformations applied to the mesh and apply the same transformations to the synthetic objects, which then appear in the correct places in the 3D views and in the floor plans. For example, the black objects in Figure 8-left represent the objects placed by the user to show the positions of sensors and windows. Likewise, the red objects in the floor plan of Figure 9-left are those same objects, projected onto the same plane as the mesh slices. Drawing synthetic objects This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license. This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license. available on arxiv