Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nemoto2024majority #152

Merged
merged 2 commits into from
Jan 25, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions content/publication/kitada2021making/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -70,3 +70,5 @@ slides: ""
---

The manuscript accepted for publication in the Applied Intelligence journal can be found [here](/publication/kitada2022making).

{{< blogcard url="https://arxiv.org/abs/2104.08763v1" >}}
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
72 changes: 72 additions & 0 deletions content/publication/nemoto2024majority/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
---
# Documentation: https://hugoblox.com/docs/managing-content/

title: "Majority or Minority: Data Imbalance Learning Method for Named Entity Recognition"
authors: ["Sota Nemoto", "Shunsuke Kitada", "Hitoshi Iyatomi"]
date: 2024-01-25T23:20:23+09:00
doi: ""

# Schedule page publish date (NOT publication's date).
publishDate: 2024-01-25T23:20:23+09:00

# Publication type.
# Legend: 0 = Uncategorized; 1 = Conference paper; 2 = Journal article;
# 3 = Preprint / Working Paper; 4 = Report; 5 = Book; 6 = Book section;
# 7 = Thesis; 8 = Patent
publication_types: ["article"]

# Publication name and optional abbreviated publication name.
publication: ""
publication_short: ""

abstract: "Data imbalance presents a significant challenge in various machine learning (ML) tasks, particularly named entity recognition (NER) within natural language processing (NLP). NER exhibits a data imbalance with a long-tail distribution, featuring numerous minority classes (i.e., entity classes) and a single majority class (i.e., O-class). The imbalance leads to the misclassifications of the entity classes as the O-class. To tackle the imbalance, we propose a simple and effective learning method, named majority or minority (MoM) learning. MoM learning incorporates the loss computed only for samples whose ground truth is the majority class (i.e., the O-class) into the loss of the conventional ML model. Evaluation experiments on four NER datasets (Japanese and English) showed that MoM learning improves prediction performance of the minority classes, without sacrificing the performance of the majority class and is more effective than widely known and state-of-the-art methods. We also evaluated MoM learning using frameworks as sequential labeling and machine reading comprehension, which are commonly used in NER. Furthermore, MoM learning has achieved consistent performance improvements regardless of language, model, or framework.
"

# Summary. An optional shortened abstract.
summary: ""

tags: ["Preprint"]
categories: ["Natural Language Processing", "Named Entity Recognition", "Imbalanced Dataset"]
featured: false

# Custom links (optional).
# Uncomment and edit lines below to show custom links.
links:
- name: Preprint
url: https://arxiv.org/abs/2401.11431
icon_pack: ai
icon: arxiv

url_pdf:
url_code:
url_dataset:
url_poster:
url_project:
url_slides:
url_source:
url_video:

# Featured image
# To use, add an image named `featured.jpg/png` to your page's folder.
# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
image:
caption: ""
focal_point: ""
preview_only: true

# Associated Projects (optional).
# Associate this publication with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `internal-project` references `content/project/internal-project/index.md`.
# Otherwise, set `projects: []`.
projects: []

# Slides (optional).
# Associate this publication with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides: "example"` references `content/slides/example/index.md`.
# Otherwise, set `slides: ""`.
slides: ""
---

{{< blogcard url="https://arxiv.org/abs/2401.11431" >}}
Loading