Guo et al. (2020) |
Deep Semantic Compliance Advisor for Unstructured Document Compliance Checking |
Stanford Natural Language Inference (SNLI) dataset (open-sourced), a real English contract dataset (NOT open-sourced) |
Graph Neural Network, attention-based RNN |
A legal professional needs 4+ hours to check each contract; DSCA returns the checking results with detailed comparison information in one minute. |
\ |
Unstructured document checking, sentiment analysis |
IJCAI-20 |
Guo et al. (2020) |
IGNITE: A Minimax Game Toward Learning Individual Treatment Effects from Networked Observational Data |
Semi-synthetic data created to mimic real-world conditions (NOT open-sourced) |
\ |
\ |
\ |
Learn Individual Treatment Effects (ITEs) from network information |
Eco |
Wang & Zhu (2020) |
Interpretable Multimodal Learning for Intelligent Regulation in Online Payment Systems |
WeChat Pay of Tencent (NOT open source) |
Attention mechanism |
85.9% accuracy; triplet loss 0.01 lower than the baseline model |
01/07/2019 - 31/08/2019 |
Investigates the relationship between transactions and texts in an e-commerce system |
IJCAI-20 |
David et al. (2020) |
Leveraging Contextual Text Representations for Anonymizing German Financial Documents |
Bundesanzeiger (BANZ) (open-sourced) |
Bi-directional Character-based Recurrent Neural Network |
98.9% Precision, 0.973 Recall, 0.972 F1 |
\ |
Application for anonymizing sensitive components in financial documents |
AAAI-20 |
Kiyoshi et al. (2020) |
Economic News Impact Analysis, Using Causal-Chain Search from Textual Data |
Tokyo Stock Exchange (open sourced) |
Causal Chain Search vs. Absolute Return in the Stock Market |
Both related (using similarity of AR) |
01/10/2012 - 31/05/2018 |
Lists of related companies were created, and the impact on their stock prices was measured for two important news items about wheat prices in 2018. Market impacts appeared in the companies related to the ripple effects when the news concerned an obvious fact. |
AAAI-20 |
Edmiston et al. (2020) |
Unsupervised Discovery of Firm-Level Variables in Earnings Call Transcript Embeddings |
Compustat |
SAFE - Graph Algorithm |
SAFE Score |
Q1-2020 |
Repurposes an algorithm from computational biology. Compares embedding methods across economic variables. |
FinNLP-2020 |
Taylor & Keselj (2020) |
Using Extractive Lexicon-based Sentiment Analysis to Enhance Understanding of the Impact of Non-GAAP Measures in Financial Reporting |
McDonald (2019) 10-K |
\ |
Hypothesis Test |
1998-2019 |
First to use an extractive approach for sentiment analysis in finance |
FinNLP-2020 |
Chen & Sarkar (2020) |
A Semantic Approach to Financial Fundamentals |
Stage One 10-X Parse Data |
BERT |
Cross-industry variation |
2006-2018 |
Introduces the Semantically-Informed Financial Index |
FinNLP-2020 |
Bambrick et al. (2020) |
NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg |
Not open-sourced |
NSTM |
User feedback |
\ |
Developed a novel system that composes concise, human-readable news overviews for arbitrary user search queries. |
ACL-2020 |
Zheng et al. (2019) |
Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction |
Chinese Financial Announcements |
Doc2EDAG |
Precision, Recall, F1 |
2008-2018 |
New model that directly generates event tables. Reformalises the document-level event extraction (DEE) task without trigger words. New real-world dataset. |
EMNLP-2019 |
Moreno-Sandoval et al. (2019) |
Tone Analysis in Spanish Financial Reporting Narratives |
ORBIS & Annual Reports |
Lexicon/Rule-based |
F1, Accuracy, Precision, Recall |
2014-2017 |
First corpus of "letters to shareholders" in Spanish. Created a gold standard to evaluate opinion systems. |
2019 (FNP) |
Tian & Peng (2019) |
Finance Document Extraction Using Data Augmentation and Attention |
\ |
Attention-based LSTM |
Weighted F1 |
\ |
Title detection using an attention-based LSTM |
2019 (FNP) |
Blumenthal & Graf (2019) |
Utilizing Pre-Trained Word Embeddings to Learn Classification Lexicons with Little Supervision |
SST-2 & FNHL |
Neural Network |
Accuracy |
\ |
Presents a novel method to learn classification lexicons from a labeled text corpus that incorporates word similarities in the form of pre-trained word embeddings |
2019 (FNP) |
Gooding & Briscoe (2019) |
Active Learning for Financial Investment Reports |
All Street Research |
Linear SVC |
F1-Score |
\ |
Built a classification pipeline to categorise investment-related content. |
2019 (FNP) |
Chen et al. (2019) |
Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments |
Reuters |
BiGRU, LR, CNN… |
F1-Score |
\ |
Provides a novel challenge and dataset. Sets a strong baseline. |
ACL-2019 |
Dereli & Saraclar (2019) |
Convolutional Neural Networks for Financial Text Regression |
10-K Data - Tsai et al. (2016) |
CNN |
Spearman's rank correlation |
\ |
Reduces dependence on lexicons. |
ACL-2019 |
Sedinkina et al. (2019) |
Automatic Domain Adaptation Outperforms Manual Domain Adaptation for Predicting Financial Outcomes |
H4N and L&M |
OLS |
t-statistic, R^2 |
\ |
Automatic domain adaptation of lexicons outperforms manual. |
ACL-2019 |
Chung-chi et al. (2017) |
NLG301 at SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News |
SemEval-2017 Task 5 |
SVM |
Cosine Similarity |
01/01/2015 - 31/10/2016 |
Text Span, Ensemble. |
SemEval-2017 Task 5 |
Chung-chi et al. (2018) |
Fine-Grained Analysis of Financial Tweets |
FiQA 2018 Task 1 |
CNN / Bi-LSTM / CRNN |
Accuracy/MSE/R2 |
\ |
Aspect, Extension Dataset |
FiQA 2018 Task 1 |