1 Introduction

Determining the real identity behind online user accounts is an important problem for many practical applications. This task is referred to as user identification. With the rapid development of the Internet, user identification has become the foundation of many practical applications, such as user migration [8], friend recommendation [6], information diffusion [19], multiple-network group interaction [5], and the analysis of network dynamics [1].

Fig. 1. The basic idea of our approach. Given four accounts, accounts 1 and 4 are asserted to belong to the same individual A, since all images in accounts 1 and 4 are captured by the same camera I.

Identifying the person behind online accounts is a challenging task. The core problem is how to find person-unique features across different accounts. Many efforts have been devoted to this problem, and many user features have been proposed. For example, public personal information such as real name, e-mail address, and IP address has been employed to identify online accounts. Linguistic stylistics [12] and writing style [16] can also be viewed as effective patterns that indicate the real identity. Furthermore, other information summarized from user behavior, such as relationships with other accounts and points of interest, has also been taken as the basis of user identification in many previous studies. These works report good performance in user identification, but they have several limitations. For example, personal information can easily be hidden deliberately for various purposes, linguistic stylistics becomes unreliable when only a small amount of text is available, and user behavior cannot precisely pin down a specific user’s identity.

In this paper, we attempt to identify users by matching their cameras. The underlying idea is that accounts belonging to the same person usually post images captured by the same camera, so the camera can be viewed as a person-unique feature. As shown in Fig. 1, Account 1 and Account 4 are both attributed to User A since the images published in each account’s album are all captured by Camera I. We therefore convert the user identification problem into a camera identification problem. In the forensics community, identifying the source of an image without extra information (e.g. EXIF metadata or the JPEG header) is an important and well-studied problem. Like a human fingerprint, each camera leaves unique traces caused by imperfections of its sensor, such as dust on the sensor or sensor pattern noise (SPN). Since SPN is the most popular camera fingerprint, the SPN extraction technique is applied in this paper to extract the person-unique feature from a user account.

Extracting camera features from online users’ albums is a challenging problem. According to the related techniques, a fine camera fingerprint is obtained by averaging the residual noise of many images taken by the same camera. However, the images in a user album may come from more than one camera. Therefore, before extracting a camera fingerprint, we must determine which images share the same camera source. Further, because users re-forward (repost) images, users that share the same camera fingerprint may have no real relationship in the offline world. Therefore, a mechanism that filters out re-forwarded images is necessary.

In this paper, we propose a camera fingerprint-based user identification scheme to address the user identification problem on image sharing platforms. To the best of our knowledge, this is the first work to tackle user identification from the forensic perspective. The main contributions can be summarized as follows:

  • A new perspective is proposed to tackle the user identification problem in image sharing platforms.

  • A camera feature extraction algorithm is proposed to extract the person-unique feature from an online account.

  • A reposted-image filtering mechanism is proposed to reduce the confusion caused by photo forwarding behavior.

2 Related Work

User identification has been studied from different perspectives, such as friend relationships. The essential question is whether online accounts can be associated with the same person in the real world. The core problem is how to find a unique and reliable pattern in the accounts’ public information.

One commonly adopted method is to extract personal information from users’ public profiles, such as the username and registered address [13]. Based on these person-unique patterns, some simple but effective algorithms have been designed, and excellent performance has been achieved on some datasets. However, many users, especially malicious users, do not release any private information on social networks, which causes these methods to fail.

Different from the methods mentioned above, [7] regards the social network as a large graph, where the nodes and edges represent the users and their link relationships, respectively. The user identification problem can then be converted into an approximate graph isomorphism problem. Similar works are reported in [9, 17, 18]. The essence of these methods is that the users’ link relationships are regarded as discriminative patterns. In recent years, many similar works have been proposed and have achieved good performance. However, such methods are not well suited to identifying specific users, especially malicious users; they may be more suitable for identifying a user group.

To solve the above problem, many studies attempt to mine latent information from users’ activities in their accounts. An impressive work [12] shows that users can be identified based on linguistic stylistics, with a model trained on the person’s text on the Internet. Furthermore, hobby and interest patterns are also employed as user identification features. These methods extract personal patterns from accounts’ public information and have achieved excellent performance. However, the identification results obtained from such features are closer to inference than to evidence, which makes such methods less reliable.

To address the above-mentioned problems, we extract the camera fingerprint as the person-unique pattern. The camera fingerprint is an invisible, camera-unique component of digital images, and the most widespread approach is PRNU (Photo Response Non-Uniformity) [10]. Recently, many works based on PRNU have been proposed and applied in practical applications. For example, [Valsesia2015] applied PRNU to retrieve images from the same camera source on the Internet, [9] attempted to tackle the problem of camera source clustering, and [4] employed PRNU for image forgery detection. Furthermore, [3] enhanced the detail of PRNU and further improved the performance. Owing to the high reliability of the camera fingerprint, it can be used as evidence in court, and it can therefore compensate for the drawbacks of traditional user identification methods. Similar studies are also reported in [11, 14, 15].

In the context of social networks, [2] verified the feasibility of PRNU on social media images. Specifically, although images on social media are compressed and downsampled, the PRNU can still be detected. Regarding user identification, to the best of our knowledge, [1] is the first work to tackle the problem based on the camera source. It proposed a novel algorithm, picture-to-identity linking, to identify users via camera identification. However, this approach relies on pairwise matching among all users’ images, which makes it very time-consuming, especially for large-scale datasets of online social network users.

3 Methodology

In this section, we propose a user feature extraction framework. The diagram is illustrated in Fig. 3. The details are given as follows.

3.1 Preliminaries

We first discuss how to extract a camera fingerprint based on [10]. Given an image \(\mathbf I \), its residual noise \(\mathbf R \) can be extracted with

$$\begin{aligned} \mathbf R =\mathbf I -\mathcal {F}(\mathbf I ), \end{aligned}$$
(1)

where \(\mathcal {F}(\cdot )\) is a denoising filter. Then, for a set of images from the same camera source \(\mathcal {I} = \{\mathbf {I}_i\}_{i=0}^{N}\), we can obtain the camera fingerprint \(\mathbf F \) by

$$\begin{aligned} \mathbf F =\frac{\sum _{i} \mathbf {I}_i\cdot \mathbf {R}_i}{\sum _{i} \mathbf {I}_{i}^2} \end{aligned}$$
(2)

The similarity \(\mathbf s _{i,j}\) between two camera fingerprints \(\mathbf F _i\) and \(\mathbf F _j\) can be calculated by the normalized correlation as follows:

$$\begin{aligned} \mathbf s _{i,j}=\frac{(\mathbf F _i - \bar{\mathbf{F }}_i)\cdot (\mathbf F _j - \bar{\mathbf{F }}_j)}{||\mathbf F _i - \bar{\mathbf{F }}_i|| \cdot ||\mathbf F _j - \bar{\mathbf{F }}_j||} \end{aligned}$$
(3)

where \(\bar{\mathbf {F}}_i\) and \(\bar{\mathbf {F}}_j\) are the means of \(\mathbf F _i\) and \(\mathbf F _j\), respectively. If the score is greater than a predefined threshold, the two image sets \(\mathcal {I}_i\) and \(\mathcal {I}_j\) are considered to be captured by the same camera device.
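Eqs. (1)-(3) can be illustrated with a minimal sketch. It assumes each image is flattened to a 1-D list of floats, and it uses a hypothetical 3-tap moving average (`box_denoise`) as a stand-in for the denoising filter \(\mathcal {F}(\cdot )\); the function names are illustrative and are not taken from [10].

```python
import math


def box_denoise(image):
    """Hypothetical stand-in for F(.): a 3-tap moving average over a 1-D signal."""
    n = len(image)
    out = []
    for i in range(n):
        window = image[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def residual(image):
    """Eq. (1): R = I - F(I)."""
    return [p - d for p, d in zip(image, box_denoise(image))]


def fingerprint(images):
    """Eq. (2): F = sum_i(I_i * R_i) / sum_i(I_i^2), computed element-wise."""
    n = len(images[0])
    num = [0.0] * n
    den = [0.0] * n
    for img in images:
        res = residual(img)
        for p in range(n):
            num[p] += img[p] * res[p]
            den[p] += img[p] ** 2
    # Assumption: positions with a zero denominator are set to 0.
    return [a / b if b else 0.0 for a, b in zip(num, den)]


def ncc(x, y):
    """Eq. (3): normalized correlation of two mean-centred vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    xc = [a - mx for a in x]
    yc = [b - my for b in y]
    norm = math.sqrt(sum(a * a for a in xc)) * math.sqrt(sum(b * b for b in yc))
    # Assumption: a constant vector has no defined correlation; return 0.
    return sum(a * b for a, b in zip(xc, yc)) / norm if norm else 0.0
```

Two image sets would then be compared with `ncc(fingerprint(set_a), fingerprint(set_b))` and the score checked against the predefined threshold.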

Fig. 2. Scatter plots verifying the proposed assumption. Each point in both diagrams is the correlation value between the residual noises of a pair of images. Orange points denote pairs of images from the same camera source (positive pairs), and blue points, in the right diagram, denote pairs from different camera sources (negative pairs). As can be seen, high values (beyond the red line) are almost always achieved by positive pairs, and few negative pairs exceed it. (Color figure online)

3.2 User Feature Extraction Algorithm

Given a set of images \(\mathcal {I}=\{\mathbf {I}_i\}_{i=0}^{N}\), the residual noise set \(\mathcal {R}=\{\mathbf {R}_i\}_{i=0}^{N}\) can be obtained with Eq. 1. Then, a pairwise similarity matrix \(\mathbf S \) is calculated with Eq. 3, where \(\mathbf S _{i,j}\) denotes the similarity between residuals \(\mathbf R _i\) and \(\mathbf R _j\). The purpose of the algorithm is to partition \(\mathcal {I}\) by camera source, based on the noisy camera fingerprints \(\mathcal {R}\) and the pairwise similarities \(\mathbf S \). The proposed algorithm is based on hierarchical clustering and contains three main steps, i.e. seed selection, group merger, and residual assignment. The details are introduced below.

Seeds Selection. The seed selection step selects an initial segmentation of a user’s album. As mentioned before, the residual noise can be viewed as a noisy camera fingerprint, and the similarity between two residuals is therefore also noisy, so the camera-source relationship cannot be determined with certainty. However, we observe an important phenomenon: a large similarity reliably indicates residuals from the same camera source. Specifically, if the correlation between two images’ residual noises is high enough, the two images have a high probability of being captured by the same camera. To verify this, the pairwise similarities of 1,576 images from 11 cameras are calculated. Pairs from the same camera are denoted as positive points, and all other pairs as negative points. They are drawn in Fig. 2. As can be seen, the negative points span a narrower range than the positive points, and points larger than 0.02 can be taken as positive. Therefore, residual pairs whose similarities are larger than 0.02 are gathered into the same subset and taken as the initial seeds.
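The seed selection step can be sketched as follows. The sketch assumes the pairwise similarity matrix is given as a list of lists `S`, and that pairs above the threshold are linked transitively with a union-find structure; the transitive linking is one reading of "gathered into the same subset", not a detail confirmed by the text. The name `select_seeds` is illustrative.

```python
def select_seeds(S, threshold=0.02):
    """Group residuals whose pairwise similarity exceeds the threshold.

    Returns (seeds, leftovers): seeds are index lists with at least two
    members, leftovers are indices not linked to any other residual.
    """
    n = len(S)
    parent = list(range(n))

    def find(a):
        # Find the set representative, with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in range(n):
        for j in range(i + 1, n):
            if S[i][j] > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    seeds = [g for g in groups.values() if len(g) > 1]
    leftovers = [g[0] for g in groups.values() if len(g) == 1]
    return seeds, leftovers
```

With four residuals where only the pair (0, 1) exceeds 0.02, the call returns one seed `[0, 1]` and leaves residuals 2 and 3 unassigned.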

Group Merger. The above step yields many seed sets. However, these subsets do not each correspond to a distinct camera source: many of the positive pairs come from the same camera. Merging such subsets further suppresses the noise in each subset’s camera fingerprint. To this end, a strategy similar to seed selection is employed to merge consistent groups. In particular, given any two clusters \(\mathbf C _{j}\) and \(\mathbf C _{k}\), we merge them into one cluster if the correlation value \(\mathbf S _{j,k}\) between their camera fingerprints is greater than 0.02. Formally, the updating procedure can be formulated as

$$\begin{aligned} \mathbf C _{i} = \mathbf C _{j} \bigcup \mathbf C _{k}, \ \text {if} \ \mathbf S _{j,k} > 0.02. \end{aligned}$$
(4)
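A minimal sketch of the merging rule in Eq. (4) is given below. It assumes a caller-supplied function `similarity(cluster_a, cluster_b)` that returns the correlation between the two clusters’ camera fingerprints; both this callable and the name `merge_groups` are hypothetical. The rescan after every merge is an illustrative choice, reflecting that a merged cluster has a new fingerprint.

```python
def merge_groups(clusters, similarity, threshold=0.02):
    """Repeatedly merge any two clusters whose similarity exceeds the threshold."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for j in range(len(clusters)):
            for k in range(j + 1, len(clusters)):
                if similarity(clusters[j], clusters[k]) > threshold:
                    # Eq. (4): the new cluster is the union of C_j and C_k.
                    clusters[j] = clusters[j] + clusters[k]
                    del clusters[k]
                    merged = True
                    break
            if merged:
                break  # restart the scan with the updated cluster list
    return clusters
```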

Residual Assignment. After the above two steps, we obtain several small subsets, but many images remain unassigned. To assign these scattered images to the correct subsets, an iterative scheme is proposed. The residual noise of an image can be viewed as a camera fingerprint corrupted by noise. To suppress this noise and obtain a more reliable camera fingerprint, we need to collect more images from the same camera source; the more images are collected, the finer the camera fingerprint becomes. At this point, several subsets with multiple images have been obtained, and a relatively fine camera fingerprint can be computed for each of them. As a result, the similarities between the remaining images’ residuals and the subsets’ camera fingerprints may exceed the preset threshold. To avoid introducing extra errors, we add a remaining image to a subset only if its similarity to that subset is the largest among all remaining images. Therefore, for M camera fingerprints \(\{\mathbf {F}_1, ..., \mathbf {F}_M\}\), a residual noise \(\mathbf R _i\) should be assigned to the jth subset based on

(5)

where \(\mathbf S _{i,j}\) is the similarity value between \(\mathbf R _i\) and \(\mathbf F _j\).
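One possible reading of the assignment rule described in the prose can be sketched as follows; it is not a transcription of Eq. (5). The sketch assumes `S[i][j]` holds the similarity between unassigned residual i and camera fingerprint j, that each subset receives at most one residual per round, and that a residual wanted by several subsets goes to the one with the highest score. The tie-breaking rule and the name `assign_step` are assumptions.

```python
def assign_step(S, threshold=0.02):
    """One assignment round; returns {subset index j: residual index i}."""
    n_res = len(S)
    n_fp = len(S[0]) if S else 0

    # For every subset, find the unassigned residual most similar to it.
    candidates = []
    for j in range(n_fp):
        i_best = max(range(n_res), key=lambda i: S[i][j])
        if S[i_best][j] > threshold:
            candidates.append((S[i_best][j], i_best, j))

    # Assumption: resolve conflicts in favour of the highest similarity.
    assignment = {}
    taken = set()
    for score, i, j in sorted(candidates, reverse=True):
        if i not in taken:
            assignment[j] = i
            taken.add(i)
    return assignment
```

An empty result means no remaining residual exceeds the threshold for any subset, which matches the stopping condition of the iteration.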

Note that a camera fingerprint may be strengthened after a new member is assigned, so the similarity between current subsets may come to exceed the preset threshold. This means that the subsets can be merged further thanks to the strengthened camera fingerprints. Therefore, after adding a single remaining image to each current subset, the group merger step is performed again. The iteration terminates only when no similarity between the remaining images and the subsets’ camera fingerprints is larger than the preset threshold. After termination, we take the camera fingerprints of all subsets and the residual noises of all remaining images as the user features. Algorithm 1 shows the details of the proposed framework.

Algorithm 1. The proposed user feature extraction framework.

3.3 Feature Refinement

With the proposed feature extraction approach, several subsets and a number of remaining images are obtained, and their camera fingerprints and residual noises are taken as the user’s features. However, these features may be confounded by reposted images, i.e. images captured by another person and posted by the current user. A camera fingerprint shared through reposted images cannot prove that the owners of the accounts have an offline relationship. Therefore, eliminating the confusion caused by reposted images is necessary for our purpose.

Distinguishing which images are reposted seems an impossible task. However, an important property of reposted images is that they have been uploaded to the Internet multiple times. Given the repeated compression applied by service providers, the quality of a reposted image is usually low. Fortunately for our purpose, the camera fingerprint component in such an image is likely to be corrupted by the double compression, so the residual noise of a reposted image probably shows a weak correlation response. Therefore, an image whose similarities with all others are very low can be viewed as a potential reposted image.

Fig. 3. Overview of the proposed user feature extraction algorithm.

In fact, small similarities with all other data points mean that an image cannot be assigned to any existing subset. In other words, the reposted images are most likely among the remaining (unassigned) images. Therefore, by discarding all the remaining images, we can suppress the confusion caused by reposted images.

3.4 Similarity Estimation

The above algorithm extracts a fine user fingerprint despite the challenges of multiple cameras and reposted images. This part discusses how to measure the similarity between two users.

Given two accounts \(\mathbf A _i\) and \(\mathbf A _j\), let \(\mathcal {C}_i\) and \(\mathcal {C}_j\) denote the features (sets of camera fingerprints) obtained by the above algorithm for \(\mathbf A _i\) and \(\mathbf A _j\), respectively. If \(\mathbf A _i\) and \(\mathbf A _j\) belong to the same person, their camera fingerprint sets are likely to intersect. Therefore, the correlation of the most similar pair of camera fingerprints, one from each feature set, is taken as the similarity between the two users. Specifically, the mathematical expression is as follows:

$$\begin{aligned} \mathbf d (\mathbf {A}_i, \mathbf {A}_j)=\mathop {\max }_{l,k} \frac{(\mathbf {C}_l^{(i)} - \bar{\mathbf {C}_l^{(i)}})\cdot (\mathbf {C}_k^{(j)} - \bar{\mathbf {C}_k^{(j)}})}{||\mathbf {C}_l^{(i)} - \bar{\mathbf {C}_l^{(i)}}|| \cdot ||\mathbf {C}_k^{(j)} - \bar{\mathbf {C}_k^{(j)}}||} \end{aligned}$$
(6)

where \(\mathbf d (\mathbf {A}_i, \mathbf {A}_j)\) denotes the similarity between \(\mathbf {A}_i\) and \(\mathbf {A}_j\), and \(\mathbf C _l^{(i)}\) and \(\mathbf C _k^{(j)}\) denote the lth camera fingerprint of \(\mathcal {C}_i\) and the kth camera fingerprint of \(\mathcal {C}_j\), respectively.
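The user similarity of Eq. (6) can be sketched as the largest normalized correlation over all pairs of fingerprints, one from each account. The sketch assumes each fingerprint is a flat list of floats of equal length; returning 0 for an account without fingerprints is an assumption, as are the function names.

```python
import math


def ncc(x, y):
    """Normalized correlation of two mean-centred vectors (as in Eq. (3))."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    xc = [a - mx for a in x]
    yc = [b - my for b in y]
    norm = math.sqrt(sum(a * a for a in xc)) * math.sqrt(sum(b * b for b in yc))
    return sum(a * b for a, b in zip(xc, yc)) / norm if norm else 0.0


def user_similarity(fingerprints_i, fingerprints_j):
    """Eq. (6): best correlation between any fingerprint pair of two accounts."""
    # Assumption: an account with no fingerprints yields similarity 0.
    return max(
        (ncc(a, b) for a in fingerprints_i for b in fingerprints_j),
        default=0.0,
    )
```

The cost is proportional to the product of the two set sizes, which stays small because each account is summarized by a few camera fingerprints rather than by all of its images.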

4 Experimental Evaluation

4.1 Settings

Before we present our experimental results, we will first introduce our settings, including dataset, metric, and baselines.

Dataset. Datasets for user identification are difficult to obtain since users are not fully labeled. Further, to the best of our knowledge, no public dataset has been released. Therefore, to verify the performance of user identification, we build several datasets from a set of base images. Specifically, the base images come from two sources: (1) 1,020 images with known camera sources (ORI) and (2) 15,382 images crawled from 96 Flickr users’ albums (FLK). We summarize our datasets as follows:

  • Single: 20 users built with ORI. The images of each camera are split into two parts, and each part is considered to be one user’s album. As a result, users whose images come from the same camera source are considered to share the same identity.

  • Double: 10 users built with ORI. Each camera’s images are split into two parts, and each user’s album consists of two parts taken from two different cameras. For each user, two other users share a camera source with it, which means these three users have the same identity.

  • Triple: 10 users built with ORI. Each camera’s images are divided into three parts, and each user’s album consists of three image sets from three different camera sources. For each user, at least three other users are labeled with the same identity.

  • Online: 192 users built with the FLK set. Each Flickr user’s album is divided into two parts. Each part is viewed as an independent account of the Flickr user, and simulated users whose images come from the same Flickr account are considered to have the same identity.

Furthermore, to verify the influence of reposted images on the proposed scheme, we collect 102 images from a single camera source and randomly assign them to all the simulated users in all the above datasets. In other words, all users repost images from this additional camera.

Metrics. To evaluate the performance of the proposed algorithm, Purity, Precision, Recall, and F1-measure are employed. Further, we use the ROC curve to measure the discriminative ability of the extracted user features. Finally, we treat user identification as a retrieval problem and use the mAP (mean Average Precision) metric.
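For reference, the mAP metric can be computed as in the following sketch. It assumes that, for every query user, the other users are ranked by decreasing similarity and marked 1 (same identity) or 0 (different identity), and that every relevant user appears in the ranking. The function names are illustrative, and the exact evaluation protocol used in the experiments is not specified here.

```python
def average_precision(ranked_relevance):
    """Average precision of one ranked list of 0/1 relevance flags."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the average precision over all query users."""
    return sum(average_precision(q) for q in queries) / len(queries)
```

For example, the ranking `[1, 0, 1]` gives an average precision of (1/1 + 2/3) / 2.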

Baselines. To fully show the feasibility of the proposed framework, the following schemes are used as baselines.

  • LtP: The algorithm (LinktoPicture) proposed in [1]. Given two users, the camera fingerprints of all images of each user are first extracted. Then, the pairwise similarities between the two users’ images are calculated, and the maximum value is taken as the users’ similarity.

  • SCF: Single Camera Fingerprint. In this scheme, the multi-camera issue is not taken into account; that is, all the images of an account are assumed to be captured by a single camera. A mixed camera fingerprint obtained from all the images is therefore taken as the user’s feature, and the similarity between two such fingerprints is taken as the similarity between two users.

  • MCF: Multiple Camera Fingerprints. This scheme is the same as the proposed method except that it omits the feature refinement step. Each scattered (unassigned) image is taken as one of the user’s camera fingerprints, and the reposted image issue is not taken into account.

4.2 Experimental Evaluation

Comparisons with the State-of-the-art Models. We treat user identification as a retrieval problem, i.e. identifying whether two users share the same camera devices. The key to the experiments is therefore to evaluate the effectiveness of the proposed feature extraction algorithm. To this end, on all the built datasets, we compare our approach with the state-of-the-art schemes (i.e. LtP, SCF, and MCF). Table 1 shows the experimental results, where bold items denote the best performance and the runners-up are underlined.

Table 1. mAP of user identification

From Table 1, several conclusions can be drawn. First, the best performance is achieved by the proposed scheme, with MCF as the runner-up, which shows that multi-camera processing is an effective step for user identification. Second, LtP achieves the worst performance, mainly because of the confusion caused by reposted images. In our experiments, the highest similarity obtained by LtP is always produced by reposted images, which causes many false identifications. Restraining reposted images is therefore necessary for user identification. Furthermore, MCF reports results comparable to the proposed method, which indicates that the multi-camera processing step itself can partly suppress the confusion caused by reposted images. For example, when reposted images are wrongly placed in a group, their influence is reduced by averaging over the images in that group. Finally, the proposed algorithm shows a minor improvement over MCF, which indicates that the proposed refinement step is effective.

Multi-camera Performance. Multi-camera processing is a necessary step in user identification, and its result may greatly influence the final identification performance. In this section, Purity, Precision, and Recall are used to evaluate the proposed scheme. Since the images in Online do not have camera labels, we only conduct the experiments on the Single, Double, and Triple datasets. The results are given in Table 2.

Table 2. Multi-camera processing Performance

In Table 2, we observe that the proposed algorithm achieves high precision, which means that most images from the same camera source are assigned to the same groups. This makes the result more robust, i.e. the influence of outliers in a group is suppressed by averaging with the camera fingerprints of the other images. We also observe that an increasing number of camera sources makes the dataset more difficult, and the proposed algorithm performs worse on the Triple dataset.

Evaluation of Reposted Image Removal. Another important issue in user identification is suppressing the confusion caused by reposted images. In the proposed algorithm, we discard the scattered images to remove the reposted images. Let \(N_{rep}\) denote the number of reposted images and \(N_{rem}\) the number of removed images. Assuming that k reposted images are correctly removed, we obtain the true removed ratio \(\mathbf{TR} = \frac{k}{N_{rep}}\), the miss removed ratio \(\mathbf{MR} = \frac{N_{rem} - k}{N_{rem}}\), and the false removed ratio \(\mathbf{FR} = \frac{N_{rep} - k}{N_{rep}}\). In this section, we use these metrics to evaluate the performance of reposted image removal. See the details in Table 3.
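The three ratios follow directly from the definitions above. The sketch below assumes the reposted and removed images are given as collections of image identifiers; the name `removal_ratios` and the convention of returning 0 for an empty set are assumptions.

```python
def removal_ratios(reposted, removed):
    """Return (TR, MR, FR) for sets of reposted and removed image ids."""
    reposted, removed = set(reposted), set(removed)
    k = len(reposted & removed)  # reposted images that were correctly removed
    n_rep, n_rem = len(reposted), len(removed)
    # Assumption: a ratio with an empty denominator is reported as 0.
    tr = k / n_rep if n_rep else 0.0
    mr = (n_rem - k) / n_rem if n_rem else 0.0
    fr = (n_rep - k) / n_rep if n_rep else 0.0
    return tr, mr, fr
```

By definition, TR and FR sum to 1 whenever at least one reposted image exists, so the two ratios carry the same information; MR separately measures how many removed images were not reposted.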

Table 3. Evaluation of Removing Reposted images

In Table 3, we observe that more than half of the reposted images are correctly removed, which means that the proposed algorithm is effective at suppressing reposted images. However, as the number of cameras increases, the number of mistakenly discarded images also increases.

5 Conclusion

In this paper, we propose an algorithm to address the user identification problem. Different from previous work, we identify users based on the camera fingerprint. The underlying idea is that accounts sharing the same camera devices may share the same identity in the offline world. To this end, a novel algorithm is proposed to extract the user’s features, which handles both the multi-camera and reposted image issues. The experimental results show that the proposed algorithm can handle both issues and effectively tackles the user identification problem.