1 Introduction

High-resolution 3D meshes have become ubiquitous, but they typically incur a large storage and transmission cost when represented naively. A great many techniques have therefore been introduced to compress 3D meshes, and most of them are tailored to meshes with a high degree of regularity.Footnote 1 However, many models produced by 3D scanning devices or human designers exhibit a significant amount of irregularity in both topology and geometry. Examples of a regular model (the apple model) and an irregular model (the hand model) are shown in Fig. 1, where the hand model shows a more random distribution of vertex valences and positions. Such irregular 3D meshes are usually challenging to compress because it is difficult to design an effective geometry prediction scheme for them.

Fig. 1

Closeup of a regular mesh (the apple model) and an irregular one (the hand model). The two meshes differ in the uniformity of vertex valences and vertex distribution

Octree-decomposition-based progressive compression [3, 9, 10, 19, 22], which yields a compact representation of the object space, has been one of the most efficient coding strategies for 3D models to date. Each level of the octree corresponds to an intermediate approximation of the input mesh, and the approximation becomes more accurate as the iterative subdivision proceeds. The information conveyed at one level can be used to predict the geometry of the next level.

Concerning the compression of irregular meshes, however, the previous works [10, 19, 22] have limitations. As shown in Fig. 2, the statistical distribution of occupancy codes (an occupancy code is an 8-bit byte emitted at each octree cell subdivision to indicate whether each child cell is occupied) usually varies across octree levels. In most previous compression schemes, to the best of our knowledge, the frequency table of the entropy coder is initialized uniformly, without exploiting the fact that the occupancy-code distributions differ significantly between levels, which wastes bit rate. In addition, previous octree-decomposition-based mesh coding techniques assume that non-empty child cells tend to be close to the centroid of the neighboring cells, which is frequently not the case for irregular models.

Fig. 2

The distribution of occupancy codes for two different levels of the octree. Level b has a more concentrated distribution than level a

In this paper, we propose an octree-based adaptive-coding method that efficiently compresses both regular and irregular meshes. For each level of the octree, we employ an entropy codec whose probability model is initialized according to the geometric properties of that level, which leads to quick convergence during probability updating and improves the compression ratio. The probability model does not need to be written into the bitstream. Moreover, we present a novel scheme based on a smoothness measure to predict the occupied octree cells, which mitigates the effect of non-uniform vertex sampling. With this scheme, the distribution of the occupancy codes becomes concentrated, which makes octree decomposition suitable for compressing both regular and irregular meshes. Since the geometry data dominate the compressed file size in most cases, we focus on geometry coding in this work. The experimental results demonstrate the improved performance of the proposed method on both scanned meshes with high regularity and irregular man-made meshes. Our method gains as much as 7.41 % in coding bit rate compared to the state-of-the-art [19].

2 Related work

Numerous methods have been developed to compress 3D meshes and point-sampled geometry. These methods can be categorized into single-rate and progressive techniques; for a more thorough survey of 3D mesh compression, we refer to [1, 18]. Most early works focus on single-rate compression, such as [2, 23, 24]. Progressive compression has since attracted much attention because it supports reconstruction at different levels of detail (LODs) and is therefore better suited to streaming transmission over networks.

Progressive mesh compression includes connectivity-centric and geometry-centric algorithms. Among the geometry-centric algorithms, there are the octree-based codec [19], the feature-oriented codec [17], the kD-tree codec [5], the spectral geometry codec [12], the wavelet-based mesh codec [13], etc. In [19], Peng et al. propose a progressive algorithm that encodes whether each cell is empty during geometry coding, showing that octree decomposition can achieve good compression ratios; for highly regular geometry data, the improvement can be very large. Motivated by the performance of octree-based methods, we also employ octree decomposition as a preliminary step for compressing irregular geometry. In [17], Peng et al. propose feature-oriented progressive lossless mesh coding, where a sequence of LODs is generated through iterative vertex-set splits and bounding-volume subdivisions. Their method preserves geometric features and minimizes surface distortion in every intermediate mesh.

For point-sampled geometry, compression algorithms have also been studied intensively. In [7], a single-rate coder is proposed, where a prediction tree is built for the input model to facilitate prediction and entropy coding. Progressive coders have been proposed in [3, 6, 9–11, 15, 20, 22, 25, 27]. Among these, [3, 9, 10, 22] are based on octree decomposition and are thus related to the approach presented in this paper. In [3], the position data are encoded through byte codes associated with octree cell subdivisions. Schnabel et al. [22] propose a progressive compression method for densely sampled point geometry, which encodes the number of non-empty child cells and the index of the child-cell configuration for each octree cell subdivision. Additional point attributes, such as color, can also be encoded in this framework, but its computational complexity makes it unsuitable for general real-time decoding. In [9, 10], Huang et al. propose a generic point-cloud encoder that compresses different attributes of point samples, such as positions, normals, and colors. During an iterative octree cell subdivision of the object space, the positions of point samples are approximated by the geometric centers of all tree-front cells at each level of subdivision.

The rest of this paper is organized as follows. Section 3 reviews the general concept of octree-based geometry compression. Section 4 describes the proposed adaptive coding for occupancy codes and the novel prediction method based on smoothness maximization. Section 5 experimentally investigates the compression of 3D irregular meshes in comparison with the state-of-the-art. Section 6 summarizes this paper.

3 Octree compression

The octree decomposition was originally used in [14, 21] for the compression of volume data. As octree-based compression methods perform well on regular 3D meshes and point clouds, we employ the same decomposition to construct a compact representation of an irregular mesh. In this section we briefly review the steps of compression based on octree decomposition.

Given the bounding box of a mesh M that is to be compressed, an octree O is constructed with a maximum number of levels L and the points in M are sorted into the cells of the octree. The points in M are replaced by the cell centers of the leaves, i.e. the points in M are quantized. The number of levels determines the precision of the coordinate quantization. The octree coding proceeds in a top-down and breadth-first fashion. The root cell can always be assumed to be non-empty. Then, each cell subdivision can be encoded by specifying which of its child cells are non-empty. The decoder can then faithfully recover the octree by following the same traversal rules as the encoder while constructing those child cells that have been specified as occupied by the encoder.
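The quantization step described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the paper's implementation: each point is mapped to its integer leaf-cell coordinates and replaced by the cell center.

```python
def quantize_point(p, bbox_min, bbox_max, levels):
    """Map one 3D point to its integer leaf-cell coordinates at the
    deepest octree level and to the cell center (the quantized
    position). Illustrative sketch, not the paper's code."""
    n_cells = 2 ** levels                        # cells per axis at the leaf level
    idx, center = [], []
    for x, lo, hi in zip(p, bbox_min, bbox_max):
        i = int((x - lo) / (hi - lo) * n_cells)  # integer cell coordinate
        i = min(max(i, 0), n_cells - 1)          # points on the max face stay inside
        idx.append(i)
        center.append(lo + (i + 0.5) * (hi - lo) / n_cells)
    return idx, center
```

The number of levels directly controls the quantization precision: with `levels = L`, each coordinate is effectively quantized to L bits.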

As shown in Fig. 3, an octree is built by recursively subdividing the bounding box of the 3D model. Every node in the octree represents a 3D cell. Each node is associated with an 8-bit binary code (4-bit in the 2D case) called the occupancy code, which indicates the non-emptiness of its children (“1” for non-empty, “0” for empty), given the counterclockwise traversal order illustrated in Fig. 3.
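Packing the children's non-emptiness flags into an occupancy code can be sketched as below; the flags are assumed to arrive in the chosen traversal order (e.g. the counterclockwise order of Fig. 3), and the bit assignment is one plausible convention rather than the paper's exact one.

```python
def occupancy_code(occupied):
    """Pack the non-emptiness flags of the 8 children, given in the
    chosen traversal order, into one byte: bit i is 1 iff child i is
    occupied. The bit convention here is an illustrative assumption."""
    assert len(occupied) == 8
    code = 0
    for i, occ in enumerate(occupied):
        if occ:
            code |= 1 << i
    return code
```

Note that code 0 never occurs in practice: a cell is only subdivided if it is non-empty, so at least one child is occupied.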

Fig. 3

Occupancy code. Given a counterclockwise traversal order, we obtain the occupancy codes in a 4-bit-binary way

If the configurations of occupied child cells can be predicted well, high compression ratios can be achieved. In general, the non-emptiness of the child cells is randomly distributed: as seen in Fig. 4(a), the occupancy codes are close to uniformly distributed when a fixed counterclockwise traversal is used. Some prior works propose to traverse the child cells in order of their probability of being non-empty: the child cell most likely to be non-empty is visited first, and the one most likely to be empty is visited last. As shown in Fig. 4(b), the statistical distribution of the occupancy codes then becomes more concentrated, which benefits entropy compression.

Fig. 4

Fixed-order traversal (a) and reordered traversal (b). The statistical distribution of the occupancy codes is close to uniform for fixed-order traversal, and more concentrated for reordered traversal

4 Adaptive coding with prediction

As the octree O is traversed in breadth-first order, the centers of the cells on the traversal front provide an approximation of the input model. Since this approximation is accessible to both encoder and decoder, we use it to predict child-cell configurations during cell subdivision. Compression of an octree can thus be implemented by encoding the occupancy codes.

In this paper, adaptive arithmetic coding (AAC) [26] is extended to encode each level of the octree. Prior knowledge of the occupancy codes, derived from intrinsic geometric properties, is employed to update the one-vertex-per-cell probability, which plays a key role in AAC. For both regular and irregular meshes, the local surface around a point is smooth, so predicting occupied cells with a smoothness measure is often preferable to using a regularity measure when compressing man-made models.

4.1 Non-emptiness prediction based on smoothness maximization

The prediction of occupied cells is a primary step in octree-based compression methods, which usually employ geometric constraints under a regularity assumption. Notably, in [9, 10] the Euclidean distances from the child-cell centers to the local tangent plane are used to prioritize the bits in the corresponding occupancy code. Here we use the smoothness-measure approach described below instead. Ranking child cells by area mitigates the effect of non-uniform sampling, since more sparsely sampled regions are weighted with larger triangle areas, whereas they would be under-represented by Euclidean-distance-based prioritization.

In general, 3D models are composed of convex and concave shapes. In their intermediate representations, such as octrees and kD trees, the local surface around a point lies on one side of its local tangent plane, except for a few shapes such as a monkey saddle. Furthermore, this observation does not depend on the representation of the 3D data, so it can be exploited to handle irregular meshes.

Consider two kinds of surface; the equivalent 2D cases are shown in Fig. 5. In Fig. 5(a), the points lie on one side of the local tangent, whereas in Fig. 5(b) they lie on both sides. Given the aforementioned property, the local tangent plane T through the center of a parent cell suggests a way to prioritize the traversal of its child cells. First, following Huang et al. [9, 10], we use a plane-side-based strategy for the initial priority assignment: on either side of the tangent plane T through a parent cell, the distances from its neighbor cells to T are accumulated, and the child cells lying on the side of T with the larger accumulated distance are assigned higher probability. At finer levels, however, most neighbor cells lie on one side of the tangent plane, so this initial priority assignment is no longer necessary.

Fig. 5

The 2D curve in (a) is equivalent to a convex or concave surface in 3D, and the one in (b) corresponds to a monkey saddle in 3D. In (a), the points lie on one side of the local tangent, whereas in (b) they lie on both sides

To decide whether or not to use this strategy, we must differentiate coarser levels from finer ones. For that purpose, we propose to use the total Gaussian curvature, defined as the integral of the Gaussian curvature over the mesh, \(\iint K\,dA\). Geometrically, every point q on the surface maps to the point on the unit sphere representing its unit normal N(q); this is the Gauss map, as shown in Fig. 6. The image points constitute a shaded region R on the unit sphere, whose algebraic area is the total Gaussian curvature of the surface [16, p. 290].

Fig. 6

Gauss map from a surface to a unit sphere

Since the mesh may be “folded” many times over a small region on the unit sphere to still yield a small total Gaussian curvature, we use the absolute value of K:

$$\iint|K|\,dA = \sum_{i=1}^{N_{\mathrm{tri}}} |K_{i}|A_{i},$$
(1)

where \(N_{\mathrm{tri}}\) is the number of triangles of the mesh, and \(A_{i}\) is the area of the ith triangle. Its value at finer levels is smaller than at coarser levels.Footnote 2 When it falls below a threshold, the plane-side-based priority assignment strategy is no longer needed.
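Equation (1) is a straightforward weighted sum. A minimal sketch, assuming per-triangle Gaussian-curvature estimates and triangle areas are already available (how these are estimated is outside this snippet):

```python
def total_abs_gaussian_curvature(K, areas):
    """Discrete version of Eq. (1): sum of |K_i| * A_i over all
    triangles, where K holds per-triangle curvature estimates and
    areas the corresponding triangle areas."""
    return sum(abs(k) * a for k, a in zip(K, areas))
```

Comparing this value against a threshold is what switches off the plane-side-based priority assignment at finer levels.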

To prioritize the child cells on the same side of the plane, pure Euclidean-distance-based sorting is used in [9, 10]. In contrast, we propose a smoothness criterion to determine the traversal order of child cells, in an effort to mitigate the effect of any uneven sampling. In the real world, most objects have smooth surfaces. From this perspective, in an octree-decomposition-based representation, the points contained in a cell tend to lie close to the local tangent plane so as to maximize smoothness. We further observe that the closer a child is to the tangent plane, the smaller the surface area of the convex hull formed by the child cell's centroid and the centroids of the parent's neighbors. To illustrate this idea, consider the 2D example in Fig. 7. The parent representative is shown as o and its neighbors are the points p and q. The children of o are represented by the points a, b, c, and d. In the 2D case, the numerical measure, i.e. the surface area of the convex hull, degenerates into the perimeter of the convex hull. Child cell a is assigned a higher probability of being non-empty than c since \(d_{ap}+d_{aq}+d_{pq} < d_{cp}+d_{cq}+d_{pq}\). The same reasoning applies to the pairs (a, b) and (a, d). Consequently, in the final traversal order of child cells, cell a is visited first.
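The 2D perimeter comparison can be checked numerically. The coordinates below are purely illustrative, chosen so that a lies near the tangent through o while c lies far from it:

```python
import math

def perimeter(points):
    """Perimeter of a polygon given as an ordered list of 2D points."""
    return sum(math.dist(points[i], points[(i + 1) % len(points)])
               for i in range(len(points)))

# Neighbors p, q of the parent o, and two candidate children
# (illustrative coordinates, not from the paper).
p, q = (-1.0, 0.0), (1.0, 0.0)
a, c = (0.0, 0.1), (0.0, 1.0)   # a lies closer to the tangent through o
# Child a gets the higher non-emptiness probability: its convex hull
# (here, a triangle) with the parent's neighbors has the smaller perimeter.
assert perimeter([a, p, q]) < perimeter([c, p, q])
```

In 3D the same comparison uses the surface area of the convex hull instead of the perimeter.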

Fig. 7

Determination of the new child-cell traversal order based on tangent-plane continuity, where a, b, c, and d are child cells, with a non-empty (filled). In the final traversal order, cell a will be visited first

With this area-oriented smoothness measure, we estimate the probability that child cells are non-empty with the following steps. Here, we assume that the connectivity between vertices in any intermediate mesh is already known (connectivity compression can be performed with existing techniques).

  1. On either side of the local tangent plane T through the center of the parent cell, we sum up the distances of the neighbor cells to T, and assign higher probability values to child cells whose centers lie on the side of T with the higher sum of distances.

  2. For child cells with centers on the same side of T, higher probability values are assigned to those with smaller values of the smoothness measure.

  3. When the total absolute Gaussian curvature of an intermediate mesh falls below a threshold, all the child cells of a cell are assigned non-emptiness probabilities according to their values of the smoothness measure alone.

4.2 Adaptive coding of occupancy codes

The occupancy codes are normally compressed with AAC, whose efficiency stems from updating the probability model during coding. AAC works well when the statistical distribution of the code values is stable. However, the distribution of the occupancy codes of an octree usually varies significantly across levels. AAC takes some time to model the (relatively stable) statistics of one level, and when it moves to another level, it needs another period to adapt to the new statistics. The statistical model therefore lags behind the actual probabilities, and compression suffers because the modeled statistics used for codeword assignment differ from the exact distribution of the current level. If prior knowledge of the occupancy codes is exploited in the probability update, a considerable compression gain can be expected.

Our observation is the following. Once the subdivision reaches a certain level, the diagonal length of a cell becomes smaller than the average edge length of the approximating intermediate mesh. Thus, most cells at this level contain only one point, so their occupancy codes contain only a single “1”, and the same holds for their children.

Figure 8 demonstrates this: the single-“1” codes occur more and more frequently as the level deepens, i.e. the code statistics change across levels. A considerable compression gain may be obtained if we can predict this change well.

Fig. 8

The percentage of single–“1” codes of each level

Denote by \(R_{l}\) the ratio between the diagonal length of a child cell and the average edge length of the current level, and by \(P_{1}\) the percentage of single-“1” codes at the next level. As the subdivision proceeds, the average edge length of the approximating intermediate mesh approaches that of the input mesh, while the diagonal length of a cell shrinks by a factor of \(\frac{1}{2}\) after each subdivision. This implies an implicit one-to-one correspondence between \(R_{l}\) and \(P_{1}\); consequently, \(R_{l}\) can serve as a predictor of \(P_{1}\).

We select several 3D models from the Princeton Shape BenchmarkFootnote 3 to model the general mapping between \(R_{l}\) and \(P_{1}\). As shown in Fig. 9, the fitted function is:

$$P_{1}=\frac{0.90}{1+ e^{150.49R_{l}-110.42}}+0.10. $$
(2)
Fig. 9

The mapping relationship between \(P_{1}\) and \(R_{l}\)

Equation (2) is used to estimate the statistical probabilities at a deeper level, so that the probability model trained by the codec converges more quickly. The probability-update process is shown in Algorithm 1. The probability is initialized to a uniform distribution at the top level. The codec runs traditional AAC until it reaches a level where the percentage of single-“1” codes is large enough for probability estimation to be worthwhile. From that level to the bottom of the octree, the percentage of single-“1” codes is estimated by Eq. (2) at the beginning of each level, and the probability of each code is rescaled accordingly. Thus, the initial probability model for each level is determined on the fly rather than written into the bitstream.

Algorithm 1

Probability update
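The per-level rescaling step can be sketched as follows. Evaluating Eq. (2) is direct; spreading the probability mass uniformly within the single-“1” group and within the remaining codes is our simplifying assumption, not necessarily the paper's exact rescaling rule:

```python
import math

def predict_p1(r_l):
    """Eq. (2): predicted fraction of single-'1' occupancy codes at the
    next level, given the ratio r_l of child-cell diagonal length to
    average edge length."""
    return 0.90 / (1.0 + math.exp(150.49 * r_l - 110.42)) + 0.10

def init_frequencies(p1):
    """Initialize a level's frequency table for AAC: mass p1 spread
    uniformly over the 8 single-'1' codes, and 1 - p1 over the other
    247 codes (code 0 never occurs, since a subdivided cell is
    non-empty). The uniform spread within each group is an assumption."""
    single = {1 << i for i in range(8)}
    return {code: (p1 / 8 if code in single else (1.0 - p1) / 247)
            for code in range(1, 256)}
```

Because both encoder and decoder can compute \(R_{l}\) from the already-decoded level, the table never has to be transmitted.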

4.3 Prioritized traversal

Although a standard breadth-first traversal of the octree already yields good results, compression performance can be improved further by reordering the traversal. Processing the cells that introduce the greatest error earlier not only leads to a faster increase of the signal-to-noise ratio during progressive decompression, but also improves the overall compression ratio, because prediction becomes more accurate for cells processed later. Similar to Peng and Kuo [19], we traverse cells in priority order according to their importance. Differently, we use the cell valence as the importance metric, which is simpler and more direct than the metric in [19]: a higher cell valence means higher importance. We then subdivide more important cells earlier to obtain better rate-distortion performance.
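The valence-driven traversal can be sketched with a max-heap. Here `valence` and `children` are hypothetical callables standing in for the octree's actual accessors:

```python
import heapq
import itertools

def prioritized_traversal(root_cells, valence, children):
    """Visit cells in order of decreasing valence instead of plain
    breadth-first order. `valence(cell)` returns the importance metric;
    `children(cell)` returns the occupied child cells created when a
    cell is subdivided. Both names are illustrative assumptions."""
    counter = itertools.count()            # tie-breaker for equal valences
    heap = [(-valence(c), next(counter), c) for c in root_cells]
    heapq.heapify(heap)                    # Python's heapq is a min-heap,
    order = []                             # so valences are negated
    while heap:
        _, _, cell = heapq.heappop(heap)
        order.append(cell)                 # subdivide / encode this cell now
        for child in children(cell):
            heapq.heappush(heap, (-valence(child), next(counter), child))
    return order
```

The counter guarantees a deterministic order for equal-valence cells, which matters because encoder and decoder must traverse identically.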

5 Experimental results

To evaluate the performance of our method on both regular and irregular meshes, we compare it with the method (PK05) of [19], the state-of-the-art octree-based compression technique on various models. As usual, compression performance is measured in bits per vertex (bpv) and the loss of quality by the peak signal-to-noise ratio (PSNR). The models used in our experiments are shown in Fig. 10. All mesh vertices are quantized with 12 bits per coordinate. All the irregular models (the meshes ‘m323’, ‘m1041’, ‘m1048’, and ‘m1085’) are from the Princeton Shape Benchmark; the Bunny model is from the Stanford 3D Scanning Repository.Footnote 4

Fig. 10

Models used in our experiments. Irregular meshes ‘m323’, ‘m1041’, ‘m1048’, and ‘m1085’ are obtained from the Princeton Shape Benchmark. The regular Bunny model is from the Stanford 3D Scanning Repository

Experimental results on bit rates (in the unit of bpv) are listed in Table 1, where the mesh name and the number of vertices in each mesh are listed in the first two columns. Then, we compare the coding bit rates for two algorithms, namely, PK05 and the mesh codec proposed in this paper. For each algorithm, we report the geometry coding costs in the third and fourth columns. Among the geometry bit rates for PK05, the one marked with ‘*’ is taken from the original paper, while others are obtained through our implementation of their octree geometry codec. The geometry coding gains are listed in the fifth column.

Table 1 Bit rates (in bpv) for geometry coding. The fifth column lists the geometry coding gain, where \(\mathit{GG}=\frac{\mathit{PK05} - \mathit{ours}}{\mathit{PK05}} \times 100~\%\). The geometry bit rate marked with ‘*’ is taken from [19]

As observed from Table 1, our codec improves the compression ratio by about 5 % on average. We define an irregularity measure IM as shown in Eq. (3),

$$\mathit{IM}=\sqrt{\mathit{Var}(S_{\mathrm{tri}})}/\mathit{Ave}(S_{\mathrm{tri}}), $$
(3)

where \(\mathit{Var}(S_{\mathrm{tri}})\) is the variance of the mesh's triangle areas, and \(\mathit{Ave}(S_{\mathrm{tri}})\) is their average. The value of IM for the Bunny model is 0.23, whereas that for ‘m1048’ is 1.22, which means that ‘m1048’ is more irregular than Bunny.
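Equation (3) is the coefficient of variation of the triangle areas. A minimal sketch; we use the population standard deviation, which is an assumption, since the paper does not specify sample versus population variance:

```python
import statistics

def irregularity(tri_areas):
    """Eq. (3): IM = sqrt(Var(S_tri)) / Ave(S_tri), the coefficient of
    variation of the triangle areas. Population variance is assumed."""
    return statistics.pstdev(tri_areas) / statistics.fmean(tri_areas)
```

A perfectly uniform triangulation gives IM = 0; larger values indicate more uneven sampling, as for ‘m1048’ versus the Bunny.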

For the Bunny model, we achieve about 12.71 bpv with 12-bit coordinate quantization, whereas PK05 requires about 13.20 bpv, which shows that our method also handles regular models well. On the ‘m1048’ model, our method gains about 7.41 % over PK05. This is because ‘m1048’ has less regularly sampled geometry, and PK05 depends heavily on the regularity assumption.

Another observation from Table 1 is that the geometry coding gain on ‘m323’ is slightly lower than that on the Bunny model. This is due to the complex shape of ‘m323’: the local surface between the fingers may not be as smooth as that of the Bunny model. Consequently, our method gains about 2.57 % on ‘m323’ compared with 3.71 % on the Bunny.

To compare rate-distortion performance, we report the PSNR-bpv results of PK05's geometry codec and the proposed codec in Table 2. The PSNR of an intermediate mesh is computed from the Euclidean distances between corresponding points in the original and reconstructed models, with the peak signal given by the diagonal length of the original model's bounding box. For the ‘m1048’ model, we achieve a PSNR of 86.60 with only 15.24 bpv, whereas PK05 requires almost 16.46 bpv to reach the same PSNR.

Table 2 Rate-distortion performance comparison between our method and PK05 [19]. Ours achieves better approximations at the same bit rates
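The PSNR used above can be sketched as follows, assuming a one-to-one correspondence between original and reconstructed points (as the text describes) and a nonzero reconstruction error:

```python
import math

def psnr(original, reconstructed, bbox_diag):
    """PSNR between corresponding 3D points, with the peak signal taken
    as the diagonal length of the original model's bounding box.
    A sketch; assumes the two point lists correspond element-wise."""
    mse = sum(math.dist(o, r) ** 2
              for o, r in zip(original, reconstructed)) / len(original)
    return 10.0 * math.log10(bbox_diag ** 2 / mse)
```

Because the peak is the bounding-box diagonal, PSNR values are comparable across models of different absolute scales.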

Some visual examples of progressive mesh reconstruction are given in Table 3. For each reconstructed mesh, its geometry bit rate is given in the table. As shown in the table, in case of the ‘m1048’ model, the reconstructed model at 6.75 bpv is almost indistinguishable from the original mesh.

Table 3 Reconstructed models at different BPVs

We also compare with the method in [17], a feature-oriented progressive compression scheme. Its geometry coding is particularly effective at preserving sharp features at low bit rates. However, when the model is fully restored losslessly, our method achieves a better compression ratio.

6 Conclusion

In this paper, we introduce an adaptive-coding scheme for meshes based on octree decomposition, which is especially useful for irregular meshes. During the subdivision of the object space, the child cells are traversed according to their probabilities of being non-empty, which are estimated using the proposed smoothness measure. The resulting occupancy codes are encoded with a refined AAC, whose probability model is initialized from the geometric properties at the beginning of each level. As demonstrated in the experiments, the proposed method considerably outperforms prior art on both regular and irregular meshes, with compression-ratio gains of up to 7.41 % for irregular meshes.

In the future, coding of additional attributes (e.g., normals, colors) of 3D models can be integrated into the proposed algorithm. The proposed method can also be generalized to encode point clouds, another popular representation of 3D data. Combining our technique with the octree-based compression scheme of [4] would extend its applicability to out-of-core models.