IS
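Inception Score. It is computed from the output vector of Inception-V3, where each dimension of the output gives the probability that the image belongs to a particular class.
For a single generated image, the entropy of the predicted class distribution p(y|x) should be as low as possible: the lower the entropy, the more confidently the image is assigned to one class, which is taken as a sign of higher image quality.
For a batch of images from the generator, the entropy of the averaged distribution p(y) should be as high as possible, which indicates that the generator produces diverse images.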
IS(G)=\exp(E_{x\sim p_g}D_{KL}(p(y|x)||p(y)))
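A minimal sketch of how the score could be computed once the class probabilities are available. It assumes `probs` is an (N, C) array whose rows are the Inception-V3 softmax outputs p(y|x); the function name `inception_score` and the `eps` smoothing term are illustrative choices, not taken from any particular library.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, C) array, each row a class distribution p(y|x)."""
    probs = np.asarray(probs, dtype=np.float64)
    # marginal class distribution p(y), averaged over the batch
    p_y = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for each image; eps avoids log(0)
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    # exponential of the mean KL divergence
    return float(np.exp(kl.mean()))

# confident and diverse: each image falls into a different class
print(inception_score(np.eye(4)))
# completely uncertain: uniform prediction for every image
print(inception_score(np.full((4, 4), 0.25)))
```

With four classes the first case gives a score of approximately 4 (the maximum for C = 4) and the second gives 1 (the minimum), matching the two requirements above: low per-image entropy and high entropy of the averaged distribution.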
FID
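Frechet Inception Distance. IS is computed on the output of the Inception network (the class probabilities), whereas FID measures the distance between real and generated images in the high-level feature space of Inception-V3. \mu is the empirical mean, \Sigma is the covariance matrix, Tr is the trace of a matrix, r denotes the real dataset and g the generated dataset.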
FID=||\mu_r-\mu_g||^2+Tr(\Sigma_r+\Sigma_g-2(\Sigma_r\Sigma_g)^{\frac{1}{2}})
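As a quick sanity check on the formula (an illustrative special case, assuming one-dimensional features so that \Sigma_r=\sigma_r^2 and \Sigma_g=\sigma_g^2 are scalars), the trace term collapses to a squared difference of standard deviations:
Tr(\sigma_r^2+\sigma_g^2-2(\sigma_r^2\sigma_g^2)^{\frac{1}{2}})=\sigma_r^2+\sigma_g^2-2\sigma_r\sigma_g=(\sigma_r-\sigma_g)^2
FID_{1D}=(\mu_r-\mu_g)^2+(\sigma_r-\sigma_g)^2
So FID is zero exactly when the two Gaussians have the same mean and the same spread, and it grows as either one drifts apart.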
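If the original dataset has M images, passing them through Inception-V3 gives an M x 2048 feature matrix. The N images produced by the generative model likewise give an N x 2048 feature matrix. From these two matrices we compute \mu and \Sigma for each set and then evaluate the FID formula above.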
# calculate frechet inception distance
import numpy
from numpy import cov, trace, iscomplexobj
from numpy.random import random
from scipy.linalg import sqrtm

def calculate_fid(act1, act2):
    # act1, act2: (num_images, num_features) activation matrices
    # calculate mean and covariance statistics
    mu1, sigma1 = act1.mean(axis=0), cov(act1, rowvar=False)
    mu2, sigma2 = act2.mean(axis=0), cov(act2, rowvar=False)
    # calculate sum squared difference between means
    ssdiff = numpy.sum((mu1 - mu2)**2.0)
    # calculate sqrt of product between cov
    covmean = sqrtm(sigma1.dot(sigma2))
    # sqrtm can return small imaginary parts from numerical error; keep the real part
    if iscomplexobj(covmean):
        covmean = covmean.real
    # calculate score
    fid = ssdiff + trace(sigma1 + sigma2 - 2.0 * covmean)
    return fid

# define two collections of activations (random stand-ins for Inception features)
act1 = random(10*2048)
act1 = act1.reshape((10,2048))
act2 = random(10*2048)
act2 = act2.reshape((10,2048))
# fid between act1 and act1
fid = calculate_fid(act1, act1)
print('FID (same): %.3f' % fid)
# fid between act1 and act2
fid = calculate_fid(act1, act2)
print('FID (different): %.3f' % fid)
CLIPScore
Cross-reference: the end of the CLIP paper notes. The score uses the cosine similarity between CLIP features.
import torch
from tqdm import tqdm

# Note: forward_modality is not defined in these notes. It is assumed to encode
# a batch with the CLIP image or text encoder selected by the flag.
def calculate_clip_score(dataloader, model, real_flag, fake_flag):
    score_acc = 0.
    sample_num = 0.
    logit_scale = model.logit_scale.exp()
    for batch_data in tqdm(dataloader):
        real = batch_data['real']
        real_features = forward_modality(model, real, real_flag)
        fake = batch_data['fake']
        fake_features = forward_modality(model, fake, fake_flag)
        # normalize features to unit length so the dot product is a cosine similarity
        real_features = real_features / real_features.norm(dim=1, keepdim=True).to(torch.float32)
        fake_features = fake_features / fake_features.norm(dim=1, keepdim=True).to(torch.float32)
        # calculate scores
        # score = logit_scale * real_features @ fake_features.t()
        # score_acc += torch.diag(score).sum()
        # equivalent to the two lines above, without building the full B x B matrix
        score = logit_scale * (fake_features * real_features).sum()
        score_acc += score
        sample_num += real.shape[0]
    # average over all samples, after the loop has finished
    return score_acc / sample_num
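The commented-out `torch.diag(score).sum()` lines in `calculate_clip_score` and the element-wise product that replaces them should give the same sum. The sketch below checks this with NumPy on random vectors standing in for CLIP features; the function names and the feature size of 8 are illustrative assumptions only.

```python
import numpy as np

def l2_normalize(x):
    # scale every row to unit length
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def paired_cosine(real, fake):
    # cosine similarity between row i of real and row i of fake
    return (l2_normalize(real) * l2_normalize(fake)).sum(axis=1)

def paired_cosine_via_matrix(real, fake):
    # same values, read off the diagonal of the full pairwise matrix
    return np.diag(l2_normalize(real) @ l2_normalize(fake).T)

rng = np.random.default_rng(0)
real = rng.normal(size=(5, 8))  # stand-in for features of the real modality
fake = rng.normal(size=(5, 8))  # stand-in for features of the generated modality

print(np.allclose(paired_cosine(real, fake), paired_cosine_via_matrix(real, fake)))
```

The element-wise form needs O(B x D) work per batch instead of the O(B^2 x D) needed for the full similarity matrix, which is presumably why the matrix version was commented out.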