Evaluation Metrics for Generative Models

IS

Inception Score. It is computed from the output vector of Inception-V3, where each dimension of the (softmax) output is the probability that the image belongs to a particular class.

For a single generated image, the entropy of the class distribution predicted by Inception should be as small as possible: the lower the entropy, the more confidently the image can be assigned to one class, and the higher its quality.

For the whole batch of generated images, the entropy of the average (marginal) class distribution output by Inception should be as large as possible, which reflects the diversity of the generator.

IS(G)=\exp\left(\mathbb{E}_{x\sim p_g}\left[D_{KL}\big(p(y|x)\,\|\,p(y)\big)\right]\right)
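As a sketch of how this formula is evaluated in practice (the function name inception_score is hypothetical, and the N×1000 matrix of softmax outputs from Inception-V3 is assumed to be given; real implementations additionally split the samples into several groups and average the scores):

import numpy as np

def inception_score(probs, eps=1e-12):
    # probs: (N, num_classes) softmax outputs of Inception-V3 for N generated images
    marginal = probs.mean(axis=0)  # p(y), the marginal class distribution
    # per-image KL divergence D_KL(p(y|x) || p(y))
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    # exponentiate the expectation over x ~ p_g
    return np.exp(kl.mean())

# toy usage with random class distributions
probs = np.random.dirichlet(np.ones(1000), size=100)
print(inception_score(probs))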

FID

Fréchet Inception Distance. While IS is computed on the final classification output of the Inception network, FID measures the distance between real and generated images in the high-level feature space of Inception-V3. Here \mu is the empirical mean, \Sigma is the covariance matrix, Tr is the trace of a matrix, r denotes the real dataset, and g the generated dataset:

FID=||\mu_r-\mu_g||^2+Tr(\Sigma_r+\Sigma_g-2(\Sigma_r\Sigma_g)^{\frac{1}{2}})

If the real dataset contains M images, we obtain an M×2048 feature matrix; the generative model produces N images, which after passing through Inception-V3 give an N×2048 feature matrix. From these two matrices we compute mu and sigma for each set and then evaluate FID with the formula above.

# example of calculating the frechet inception distance
import numpy
from numpy import cov, trace, iscomplexobj
from numpy.random import random
from scipy.linalg import sqrtm

# calculate frechet inception distance
def calculate_fid(act1, act2):
    # calculate mean and covariance statistics
    mu1, sigma1 = act1.mean(axis=0), cov(act1, rowvar=False)
    mu2, sigma2 = act2.mean(axis=0), cov(act2, rowvar=False)
    # calculate sum squared difference between means
    ssdiff = numpy.sum((mu1 - mu2)**2.0)
    # calculate sqrt of product between cov
    covmean = sqrtm(sigma1.dot(sigma2))
    # check and correct imaginary numbers from sqrt
    if iscomplexobj(covmean):
        covmean = covmean.real
    # calculate score
    fid = ssdiff + trace(sigma1 + sigma2 - 2.0 * covmean)
    return fid

# define two collections of activations
act1 = random(10*2048)
act1 = act1.reshape((10,2048))
act2 = random(10*2048)
act2 = act2.reshape((10,2048))
# fid between act1 and act1 (identical sets, should be close to 0)
fid = calculate_fid(act1, act1)
print('FID (same): %.3f' % fid)
# fid between act1 and act2
fid = calculate_fid(act1, act2)
print('FID (different): %.3f' % fid)
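The demo above feeds random activations into calculate_fid. In practice the 2048-dimensional activations come from the average-pooling layer of Inception-V3; a minimal sketch with torchvision (assuming torchvision >= 0.13, and that images is already an N×3×299×299 tensor normalized the way the pretrained model expects):

import torch
from torchvision.models import inception_v3

# replace the classification head with an identity so the forward pass
# returns the 2048-d pooled features instead of class logits
model = inception_v3(weights="DEFAULT")
model.fc = torch.nn.Identity()
model.eval()

@torch.no_grad()
def get_activations(images):
    # images: (N, 3, 299, 299) tensor, preprocessed for Inception-V3
    return model(images).numpy()  # (N, 2048) activations for calculate_fid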

CLIPScore

Linked from the end of the CLIP paper notes: CLIPScore uses the cosine similarity between CLIP features.

import torch
from tqdm import tqdm

# forward_modality (defined elsewhere in the repo) is assumed to encode a batch
# with CLIP's image or text encoder, depending on the flag, and return the features
def calculate_clip_score(dataloader, model, real_flag, fake_flag):
    score_acc = 0.
    sample_num = 0.
    logit_scale = model.logit_scale.exp()
    for batch_data in tqdm(dataloader):
        real = batch_data['real']
        real_features = forward_modality(model, real, real_flag)
        fake = batch_data['fake']
        fake_features = forward_modality(model, fake, fake_flag)

        # normalize features
        real_features = real_features / real_features.norm(dim=1, keepdim=True).to(torch.float32)
        fake_features = fake_features / fake_features.norm(dim=1, keepdim=True).to(torch.float32)

        # calculate scores: sum of per-sample cosine similarities, scaled by logit_scale
        # score = logit_scale * real_features @ fake_features.t()
        # score_acc += torch.diag(score).sum()
        score = logit_scale * (fake_features * real_features).sum()
        score_acc += score
        sample_num += real.shape[0]

    return score_acc / sample_num
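The snippet above is repository-style code in which forward_modality hides the CLIP encoders. A more self-contained sketch of the same idea, using the Hugging Face transformers CLIP model (the model name, file name, and prompt are placeholders; the raw cosine similarity is sometimes rescaled by a constant such as logit_scale, or 2.5 as in the CLIPScore paper):

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated.png")   # placeholder: a generated image
text = "a photo of a cat"             # placeholder: its text prompt

inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# L2-normalize the projected embeddings and take the cosine similarity
img_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
txt_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
clip_score = (img_emb * txt_emb).sum(dim=-1)
print(clip_score.item())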
