---
layout: default
---
<style>
  .column {
    float: left;
    width: 50%;
  }

  /* Clear floats after the columns */
  .row:after {
    content: "";
    display: table;
    clear: both;
  }
</style>

<div class="page clearfix" index>

    <div class="left" style="width: 100%; margin: 0 auto">
        <div class="row">
          <div class="column">
            <img src="images/indolem.png" height="100%" width="75%" alt="Indolem Logo">
          </div>
          <div class="column" style="float: right">
          </div>
        </div>
    </div>
    <hr>
    <div class="left" style="width: 100%; margin: 0 auto">
      <div>
      <span style="font-size:110%">Selamat Datang! (Welcome!) </span>
      <p style="font-size:95%">
      <br>
      IndoLEM (<b>Indo</b>nesian <b>L</b>anguage <b>E</b>valuation <b>M</b>ontage) is a comprehensive Indonesian NLP dataset encompassing a broad range of morpho-syntactic,
      semantic, and discourse analysis competencies. In the spirit of the <a href="https://www.gluebenchmark.com" target="_blank"><b>GLUE Benchmark</b></a>, the purpose of IndoLEM
      is to benchmark progress in Indonesian NLP. The tasks in IndoLEM fall into the following three categories:
      </p>
      </div>
      <ul>
        <li style="font-size:95%">&#9830; Morpho-syntax and Sequence Labelling Tasks</li>
            <ul>
              <li style="font-size:90%"> Part-of-speech (POS) tagging</li>
              <li style="font-size:90%"> Named entity recognition (NER)</li>
              <li style="font-size:90%"> Dependency parsing</li>
            </ul>
        <li style="font-size:95%">&#9830; Semantic Tasks</li>
            <ul>
              <li style="font-size:90%"> Sentiment Analysis</li>
              <li style="font-size:90%"> Summarization</li>
            </ul>
        <li style="font-size:95%">&#9830; Discourse Coherence Tasks</li>
            <ul>
              <li style="font-size:90%"> Next Tweet Prediction</li>
              <li style="font-size:90%"> Tweet Ordering</li>
            </ul>
      </ul>
      <span style="font-size:95%"> <b>Paper:</b> </span>
      <p style="font-size:90%">
        Fajri Koto, Afshin Rahimi, Jey Han Lau, and Timothy Baldwin.
        <a href="https://www.aclweb.org/anthology/2020.coling-main.66.pdf" target="_blank"><i>IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained
            Language Model for Indonesian NLP</i></a>. In Proceedings of the 28th International Conference on Computational Linguistics (COLING), December 2020.
      </p>
    </div>
</div>