Open
Description
Hi! I revisited the code, and I think the dtw function could be refined. When two or all three of option_diag, option_up and option_left are equal, the dynamic programming problem has several optimal solutions, and the one that gets picked can be an incorrect alignment. Here is an example:
Base_to_Blending:
[0],"[1, 2, 3]",[4],[5],[6],[7],"[8, 9]",[10],[11],"[12, 13]",[14],[15],"[16, 17, 18, 19]",[20],[21],"[22, 23]",[24],[25],[26],"[27, 28, 29, 30, 31]",[32],[33],[34],[35],"[36, 37]",[38],[39],[40],[41],[42],[43],[44],[45],"[46, 47]",[48],[49],[50],[51],[52],[53],[54],[55],[56],[57],[58],[59],[60],[61],"[62, 63]",[64],[65],[66],[67],"[68, 69]",[70],"[71, 72]",[73],[74],[75],[76],[77],"[78, 79, 80]",[81],[82],[83],[84],"[85, 86]",[87],"[88, 89]",[90],[91],[92],"[93, 94]",[95],"[96, 97]",[98],[99],[100],[101],[102],[103],[104],[105],[106],[107],[108],[109],[110],[111],[112],[113],[114],[115],[116],[117],[118],[119],"[120, 121]",[122],[123],[124],[125],"[126, 127]",[128],"[129, 130]",[131],[132],[133],"[134, 135]",[136],[137],[138],[139],"[140, 141]",[142],[143],[144],"[145, 146]",[147],[148],[149],[150],[151],[152],"[153, 154]",[155],[156],[157],"[158, 159]",[160],[161],[162],[163],[164],[165],[166],[167],"[168, 169]",[170],[171],[172],[173],[174],[175],[176],[177],[178],[179],"[180, 181]",[182],[183],[184],[185],[186],"[187, 188]",[189],"[190, 191]",[192],[193],[194],[195],[196],[197],[198],[199],"[200, 201]",[202],[203],[204],[205],[206],[207],[208],[209],[210],[211],[211],[212],[213],"[214, 215]",[216],[217],[218]
Base_Input_Tokens:
HT,C,'s,ĠVive,ĠPro,Ġheadset,Ġis,Ġavailable,Ġto,Ġpre,-order,Ġfor,Ġ$,799,ĊĊ,We,'ve,Ġseen,Ġplenty,Ġof,ĠBeats,-focused,ĠK,IR,Fs,Ġin,Ġour,Ġtime,",",Ġsome,Ġbetter,Ġthan,Ġothers,.,ĠFew,",",Ġhowever,",",Ġplay,Ġquite,Ġso,Ġdirectly,Ġon,Ġthe,Ġname,Ġas,ĠOrig,Audio,'s,ĠBe,ets,.,ĠFor,Ġ$,25,",",Ġadopt,ers,Ġget,Ġa,Ġset,Ġof,Ġheadphones,Ġthat,Ġbear,Ġlittle,Ġdirect,Ġresemblance,Ġto,ĠDr,.,ĠDre,'s,Ġaudio,Ġgear,Ġof,Ġchoice,",",Ġbut,Ġare,Ġno,Ġdoubt,Ġbound,Ġto,Ġimpress,Ġfriends,Ġ--,Ġat,Ġleast,",",Ġup,Ġuntil,Ġthey,Ġsee,Ġa,Ġroot,Ġvegetable,Ġlogo,Ġinstead,Ġof,Ġa,Ġlower,-case,ĠB,.,ĠThankfully,",",Ġthere,'s,Ġmore,Ġto,Ġit,Ġthan,Ġjust,Ġamusing,Ġand,Ġconfusing,Ġpeers,.,ĠEvery,Ġpurchase,Ġwill,Ġlead,Ġto,Ġa,Ġdonation,Ġof,Ġcanned,Ġbe,ets,Ġ(,what,Ġelse,?),Ġto,Ġthe,ĠSecond,ĠHarvest,ĠFood,ĠBank,Ġof,ĠOrange,ĠCounty,.,ĠFor,Ġus,",",Ġthat,'s,Ġreason,Ġenough,Ġto,Ġhope,Ġthat,ĠBeats,Ġdoesn,'t,Ġput,Ġthe,Ġk,ib,osh,Ġon,ĠOrig,Audio,'s,Ġeffort,.,ĠBesides,",",Ġwe,Ġcould,Ġuse,Ġsome,Ġaccom,pan,iment,Ġfor,Ġour,ĠBeet,Box,.,<|eot_id|>
Blending_Input_Tokens:
▁HT,C,',s,▁V,ive,▁Pro,▁head,set,▁is,▁available,▁to,▁pre,-,order,▁for,▁$,7,9,9,<0x0A>,<0x0A>,We,',ve,▁seen,▁plenty,▁of,▁Be,ats,-,f,oc,used,▁K,IR,F,s,▁in,▁our,▁time,",",▁some,▁better,▁than,▁others,.,▁F,ew,",",▁however,",",▁play,▁quite,▁so,▁directly,▁on,▁the,▁name,▁as,▁Orig,Audio,',s,▁Be,ets,.,▁For,▁$,2,5,",",▁ad,op,ters,▁get,▁a,▁set,▁of,▁head,ph,ones,▁that,▁bear,▁little,▁direct,▁res,embl,ance,▁to,▁Dr,.,▁Dre,',s,▁audio,▁g,ear,▁of,▁choice,",",▁but,▁are,▁no,▁doubt,▁bound,▁to,▁impress,▁friends,▁--,▁at,▁least,",",▁up,▁until,▁they,▁see,▁a,▁root,▁veget,able,▁logo,▁instead,▁of,▁a,▁lower,-,case,▁B,.,▁Thank,fully,",",▁there,',s,▁more,▁to,▁it,▁than,▁just,▁am,using,▁and,▁confusing,▁pe,ers,.,▁Every,▁purchase,▁will,▁lead,▁to,▁a,▁don,ation,▁of,▁can,ned,▁be,ets,▁(,what,▁else,?),▁to,▁the,▁Second,▁Har,vest,▁Food,▁Bank,▁of,▁Orange,▁County,.,▁For,▁us,",",▁that,',s,▁reason,▁enough,▁to,▁hope,▁that,▁Be,ats,▁doesn,',t,▁put,▁the,▁k,ib,osh,▁on,▁Orig,Audio,',s,▁effort,.,▁Besides,",",▁we,▁could,▁use,▁some,▁accompan,iment,▁for,▁our,▁Be,et,Box,.,</s>
TinyLlama_MiniThinky_one example.csv
Counting from zero, base token 1 ('C') should align only with blending token 1 ('C'). However, in the dtw result above (Base_to_Blending), base token 'C' is mapped to blending tokens [1, 2, 3], that is 'C', ''' and 's', which is incorrect. I believe the underlying reason is the tie between options described above, but I have not yet figured out how to fix it.
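To make the tie issue concrete, here is a minimal sketch, not the repository's actual dtw implementation. The function name `dtw_path`, the `prefer` parameter, the 0/1 token distance, and the shortened toy tokens are all assumptions made for illustration; only the names option_diag, option_up and option_left come from the code discussed above. It shows that when the three predecessor costs tie, the order in which they are checked decides which optimal path is returned.

```python
from typing import Callable, List, Sequence, Tuple

Path = List[Tuple[int, int]]


def dtw_path(
    base: Sequence[str],
    blend: Sequence[str],
    dist: Callable[[str, str], float],
    prefer: Tuple[str, str, str] = ("diag", "up", "left"),
) -> Tuple[Path, float]:
    """Toy DTW (hypothetical helper): returns (path, total cost).

    `prefer` is the order in which tied predecessors are tried
    during backtracking.
    """
    n, m = len(base), len(blend)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            option_diag = cost[i - 1][j - 1]
            option_up = cost[i - 1][j]
            option_left = cost[i][j - 1]
            cost[i][j] = dist(base[i - 1], blend[j - 1]) + min(
                option_diag, option_up, option_left
            )

    # Backtrack from the end; ties are resolved by the `prefer` order.
    i, j = n, m
    path: Path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        options = {
            "diag": (cost[i - 1][j - 1], i - 1, j - 1),
            "up": (cost[i - 1][j], i - 1, j),
            "left": (cost[i][j - 1], i, j - 1),
        }
        best = min(value[0] for value in options.values())
        for name in prefer:
            if options[name][0] == best:
                _, i, j = options[name]
                break
    path.reverse()
    return path, cost[n][m]


def zero_one(x: str, y: str) -> float:
    # Assumed distance for the sketch: 0 for identical tokens, 1 otherwise.
    return 0.0 if x == y else 1.0


# Shortened toy version of the start of the example above.
base_tokens = ["C", "'s", "V"]
blend_tokens = ["C", "'", "s", "V"]

path_diag_first, total = dtw_path(base_tokens, blend_tokens, zero_one)
path_left_first, _ = dtw_path(
    base_tokens, blend_tokens, zero_one, prefer=("left", "up", "diag")
)
print(total, path_diag_first, path_left_first)
```

Both calls reach the same total cost, yet they return different paths. With the diagonal checked first, base token 'C' is paired with both 'C' and ''' in the toy blending sequence, which mirrors the misalignment reported above. With left checked first, 'C' is aligned correctly but 'V' picks up an extra token instead. In this toy case neither fixed order yields the intended alignment, which suggests that a secondary criterion for ties may be needed rather than just a different fixed order.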