Labels and problem with Classification model #91
Replies: 4 comments 2 replies
-
Hi, regarding your first question. If you have

Regarding the losses: for a binary classification task you should use binary cross-entropy loss, while for a multi-class classification task you should use the generalisation of this function to many labels, known as categorical cross-entropy. Have a look at lambeq's tutorial for classification for more information.

Regarding your second question: from your description, it's really hard to say what could be wrong with your model. Your dataset seems very simple, and classifying a sentence correctly actually depends only on the last token (whether it is a question mark or not). My understanding is that there's probably something wrong with your code. Here are a few suggestions:

Let us know how it goes.
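To make the point about labels and losses concrete, here is a minimal pure-Python sketch. It assumes the two-dimensional labels are one-hot vectors (one slot per class); the helper names `one_hot`, `binary_cross_entropy` and `categorical_cross_entropy` are made up for illustration and are not part of lambeq.

```python
import math


def one_hot(label: int, num_classes: int) -> list:
    """Encode an integer class index as a one-hot vector (hypothetical helper)."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def binary_cross_entropy(p: float, y: int, eps: float = 1e-12) -> float:
    """Loss for one predicted probability p of class 1 and a 0/1 target y."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def categorical_cross_entropy(probs, target, eps: float = 1e-12) -> float:
    """Loss for a predicted distribution and a one-hot target vector."""
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, target))


# With two classes, the categorical loss on [1 - p, p] and a one-hot target
# gives the same value as the binary loss on p.
p = 0.8
bce = binary_cross_entropy(p, 1)
cce = categorical_cross_entropy([1 - p, p], one_hot(1, 2))

# For more classes, only the length of the one-hot vector changes.
four_class_label = one_hot(2, 4)
```

Under this encoding, going from two classes to many only changes the size of the label vector and of the model output, which is why the categorical loss is the natural generalisation.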
-
Hi @dimkart, sorry for not answering sooner. First, I wanted to thank you: the tip about using a sequence model got the accuracy up to 80%. Again, thanks a lot for the help.
-
Yes, it's the default behaviour of Bobcat to ignore punctuation rules and tokens, since they are not standard CCG. I've written a short method to show you how to fix this, by replacing the punctuation rule with backward application.

```python
from lambeq import CCGTree, CCGRule, BobcatParser, diagram2str
from discopy import Ty


def to_tree_with_punct(tree: CCGTree) -> CCGTree:
    s = Ty('s')
    if (len(tree.children) == 2 and tree.children[0].biclosed_type == s
            and tree.children[1].biclosed_type == Ty('punc')):
        tree.children[1].biclosed_type = s >> s
        tree.rule = CCGRule.BACKWARD_APPLICATION
    return tree


parser = BobcatParser()

# We now start by getting the CCG tree of the sentence,
# not directly the diagram.
t = parser.sentence2tree("What is the meaning of life ?")
print(t.deriv())
```

Output (without using the method) is:

Note the non-standard

```python
print(diagram2str(t.to_diagram()))
```

Output:

Now if you use the function:

```python
new_tree = to_tree_with_punct(t)
print(diagram2str(new_tree.to_diagram()))
```

Output:

Hope this helps.
-
Hi @dimkart, thanks for the method and the explanation. Thanks for the help.
-
Hi Lambeq community,
I started a small investigation project and wanted to include a quantum-based model to compare some results and study the current state. While building my own model using the provided tutorials, I ran into two main doubts.
What is the idea behind defining the labels of the data in this two-dimensional binary way? And if I were to build a model for multiple labels, what would be the correct way to define the labels for use with lambeq?
I replicated the steps of the classical case but with my own dataset, which contains over 900 sentences classified into two categories: 1 - user is addressing the bot directly, 0 - user is not addressing the bot. The idea is to build a model using lambeq that is able to make this kind of classification. The problem is that I get really poor results (rarely above 55% accuracy), no matter how I set the hyperparameters. Compared with the example notebook, the only part that is changed is the addition of the atomic type: PREPOSITIONAL_PHRASE = Ty('p').
*Edit: Here is an example of the sentences I am working with:
0 i really do not like horror games .
0 I believe that exercise is crucial for staying healthy .
0 i love comic books they keep me entertain .
0 i like a lot of different music .
1 Do you like coffee ?
1 what is your name ?
1 can we talk about the batman film ?
1 What are your thoughts on animal testing ?
Any ideas/tips to improve my model?
Thanks a lot for help.
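As a quick sanity check on the sample sentences above, here is a minimal sketch of the trivial baseline hinted at in the replies: predict class 1 whenever the last token is a question mark. The helpers `read_samples` and `ends_with_question_mark` are hypothetical names, and the "label sentence" line format is assumed from the examples in this post. Only the eight sentences shown here are used, so this says nothing about the full 900-sentence dataset.

```python
SAMPLES = [
    "0 i really do not like horror games .",
    "0 I believe that exercise is crucial for staying healthy .",
    "0 i love comic books they keep me entertain .",
    "0 i like a lot of different music .",
    "1 Do you like coffee ?",
    "1 what is your name ?",
    "1 can we talk about the batman film ?",
    "1 What are your thoughts on animal testing ?",
]


def read_samples(lines):
    """Split 'label sentence' lines into integer labels and sentences."""
    labels, sentences = [], []
    for line in lines:
        label, sentence = line.split(" ", 1)
        labels.append(int(label))
        sentences.append(sentence.strip())
    return labels, sentences


def ends_with_question_mark(sentence: str) -> int:
    """Baseline rule: class 1 if the last token is '?', else class 0."""
    return int(sentence.split()[-1] == "?")


labels, sentences = read_samples(SAMPLES)
predictions = [ends_with_question_mark(s) for s in sentences]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"baseline accuracy on the samples: {accuracy:.2f}")
```

If a rule this simple separates the data, a trained model scoring near 55% more likely points to a problem in the pipeline (for example, the final punctuation token being dropped before the model sees it) than to the difficulty of the task.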