I am confused about the calculation of cross entropy in PyTorch. It is quite common to calculate the cross entropy between two probability distributions, rather than between a prediction and a fixed one-hot label. If I want to calculate the cross entropy between two tensors and the target tensor is not a one-hot label, which loss should I use? The basic loss function CrossEntropyLoss forces the target to be an integer class index, so it does not apply in this case, and BCELoss seems to work but gives an unexpected result.

The expected formula for the cross entropy is

H(y, p) = -sum_i y_i * log(p_i)

but BCELoss calculates the BCE of each dimension, which is expressed as

-y_i * log(p_i) - (1 - y_i) * log(1 - p_i)

Compared with the first equation, the term -(1 - y_i) * log(1 - p_i) should not be involved. In an example using BCELoss (see the first sketch below), we can see that the second term is included in each dimension's result, and that makes the result differ from the correct one: if we sum the results over all dimensions, the final cross entropy does not correspond to the expected value, because every dimension contributes its own -(1 - y_i) * log(1 - p_i) term. In contrast, TensorFlow calculates the correct cross entropy value with CategoricalCrossentropy; with the same setting (second sketch below), the cross entropy is computed exactly as in the first formula. Is there any function in PyTorch that calculates the correct cross entropy using the first formula, just like CategoricalCrossentropy in TensorFlow?

Cross-entropy loss is what you want. It is used to compute the loss between two arbitrary probability distributions, and indeed its definition is exactly the equation that you provided:

H(p, q) = -sum_x p(x) * log(q(x))

where p is the target distribution and q is your predicted distribution. See this StackOverflow post for more information.

The fundamental problem is that you are using the BCELoss function incorrectly. PyTorch's BCELoss computes the binary cross-entropy loss, which is formulated differently: it is meant for classification problems where the target class can only be 0 or 1. In binary cross-entropy you need just one probability, e.g. 0.2, meaning that the probability of the instance being class 1 is 0.2; correspondingly, class 0 has probability 0.8. In your example, the line y = tf.convert_to_tensor(...) implicitly models a multi-class classification problem where the target class can be one of three classes (the length of that tensor). More specifically, that line says that for this one data instance, class 0 has probability 0.2, class 1 has probability 0.2, and class 2 has probability 0.6.
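For reference, here is a minimal sketch reconstructing the question's BCELoss example. The exact tensors from the original post were not preserved; the target y = [0.2, 0.2, 0.6] comes from the values described in the answer, and the prediction p = [0.1, 0.3, 0.6] is an assumed stand-in.

```python
import torch
import torch.nn as nn

p = torch.tensor([0.1, 0.3, 0.6])  # predicted distribution (assumed values)
y = torch.tensor([0.2, 0.2, 0.6])  # target distribution (from the answer's description)

# reduction='none' keeps the per-dimension results so the extra term is visible
bce = nn.BCELoss(reduction='none')
print(bce(p, y))
# tensor([0.5448, 0.5261, 0.6730])
# Each element is -y_i*log(p_i) - (1 - y_i)*log(1 - p_i), so the unwanted
# -(1 - y_i)*log(1 - p_i) term is included in every dimension.
```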
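The TensorFlow comparison, under the same assumed values, shows CategoricalCrossentropy applying the first formula directly:

```python
import tensorflow as tf

p = tf.convert_to_tensor([[0.1, 0.3, 0.6]])  # predicted distribution (assumed values)
y = tf.convert_to_tensor([[0.2, 0.2, 0.6]])  # target distribution

cce = tf.keras.losses.CategoricalCrossentropy()
print(cce(y, p).numpy())
# ~1.0078, i.e. -sum_i y_i*log(p_i), with no -(1 - y_i)*log(1 - p_i) term
```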
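Finally, a sketch of the correct computation in PyTorch under the same assumed tensors. The one-liner implements the first formula directly; the F.cross_entropy variant relies on PyTorch 1.10+ accepting class probabilities as targets, and passes log(p) because that function expects logits rather than probabilities.

```python
import torch
import torch.nn.functional as F

p = torch.tensor([0.1, 0.3, 0.6])  # predicted distribution (assumed values)
y = torch.tensor([0.2, 0.2, 0.6])  # target distribution

# Direct implementation of H(y, p) = -sum_i y_i * log(p_i)
ce = -(y * torch.log(p)).sum()
print(ce)  # tensor(1.0078), matching TensorFlow's CategoricalCrossentropy

# Since PyTorch 1.10, cross_entropy also accepts probability targets.
# It expects logits as input; log(p) works here because p sums to 1,
# so log_softmax(log(p)) == log(p).
ce2 = F.cross_entropy(torch.log(p).unsqueeze(0), y.unsqueeze(0))
print(ce2)  # tensor(1.0078)
```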