Autogenerate event pages #592
**event
}

# Doesn't create a file if it's an external event, as external events should not have an id
What's an example of an "external" event?
For example, AmericasNLP. We don't have a page for it, but we link to the event externally.
Alright, let's say
# Doesn't create a file if it's an event we don't have a page for, which should not have an ID.
generate.py
Outdated
name = event['name']

# Start and end dates are necessary for listing the events on events.md
if not isinstance(event['startDate'], str):
Really we should be checking that it's parsed as a date.
generate.py
Outdated
if not isinstance(event['startDate'], str):
    raise Exception(event['startDate'])

if not isinstance(event['endDate'], str):
Maybe if endDate is missing, we assume it's just a single day event?
Yes, we can.
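A minimal sketch of what the two suggestions above could look like together: parse the values as real dates instead of only checking for `str`, and fall back to a single-day event when `endDate` is missing. The helper name `parse_event_dates` is hypothetical, and the sketch assumes dates are written as ISO `YYYY-MM-DD` strings.

```python
from datetime import date


def parse_event_dates(event: dict) -> tuple[date, date]:
    """Return (start, end) as date objects; raise ValueError on bad input.

    Hypothetical helper, assuming ISO dates such as "2024-06-24".
    """
    label = event.get("name", "<unnamed event>")
    try:
        start = date.fromisoformat(event["startDate"])
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"{label}: startDate is missing or not a valid date") from exc

    raw_end = event.get("endDate")
    # A missing or empty endDate is treated as a single-day event.
    end = date.fromisoformat(raw_end) if raw_end else start
    if end < start:
        raise ValueError(f"{label}: endDate is before startDate")
    return start, end
```

With this, an entry that only has `startDate` yields the same date for both values, and a malformed value fails with a message naming the event rather than just echoing the raw value.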
_data/_events.json
Outdated
"location":"",
"startDate":"",
"endDate":"",
"date":"",
Do we need this in addition to startDate?
Hi @tovmasharrison !
Is there a way to have events whose date is unknown (and thus have no info in the date column) manually placed in a certain row?
Right now, I have the WMT24 example, which appears in the middle of the table.
WMT events are usually at the end of the year, so I placed it as the first event because they are organized in reverse chronological order.
Hi!
I'll take a look.
Hi @cefoo,
Since we are auto-generating the events, there wasn't a straightforward way to place the WMT24 event manually at the top. However, I provided an approximate date for it based on the rest of the WMT events so it can be at the top until the actual date becomes known.
Please let me know if that solution is ok.
@liashahnazaryan has just created the WMT24 page, with the date in December.
However, it would be good to make sure WMT stays on top in future years, so this could be a good practice moving forward. WMT is usually in December, so I guess we could just put December 31st until we know the real date. However, that placeholder date should be hidden until we know the real one.
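One way the placeholder idea could be sketched: keep the provisional date in `startDate` for sorting, but mark it so the listing leaves the date column empty. The `dateConfirmed` flag and both function names are hypothetical and not part of this PR.

```python
from datetime import date


def sort_for_listing(events: list[dict]) -> list[dict]:
    """Order events newest first, as in the reverse-chronological table."""
    return sorted(events, key=lambda e: date.fromisoformat(e["startDate"]), reverse=True)


def displayed_date(event: dict) -> str:
    """Return the date to show, or an empty string for a placeholder date.

    `dateConfirmed` is a hypothetical flag; entries without it are
    treated as confirmed.
    """
    if not event.get("dateConfirmed", True):
        return ""
    return event["startDate"]
```

An entry such as `{"name": "WMT24", "startDate": "2024-12-31", "dateConfirmed": false}` would then sort to the top of its year while showing no date.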
_data/_events.json
Outdated
"linkUrl":[
    ""
],
"impDatesHeader":"",
What is this?
If `imp` stands for "important", let's just write `importantDatesHeader`.
I think we just want:
"importantDates": [ {
"name": "...",
"date": "..."
},
...
]
_data/_events.json
Outdated
"blockquoteSpeakersHeader":"",
"speakHeadcontent":[
    {
        "bshOpeningSentence":"",
?
_data/_events.json
Outdated
    ""
],
"blockquoteSpeakersHeader":"",
"speakHeadcontent":[
`Content` with uppercase
_data/events.json
Outdated
    "location":"Chicago, Illinois"
},
{
    "name":"<a href='https://turing.iimas.unam.mx/americasnlp/2024_workshop.html'>AmericasNLP</a>",
I don't think there should be a URL in the name.
_data/_events.json
Outdated
{
    "name":"",
    "id":"",
    "title":"",
Do we need both name and title?
_data/_events.json
Outdated
        "date":""
    }
],
"bulletKeyNoteSpeakOrTopsHeader":"",
Again, we should have this.
Instead of "bulletKeyNoteSpeakOrTopsHeader" and "names", we just need
"organisers": [
{
"name": "...",
"institution": "..."
},
...
]
_data/_events.json
Outdated
        "url":""
    }
],
"sharedTasksHeader":"",
Isn't this only for WMT?
_data/_events.json
Outdated
        "optionalSentence":""
    }
],
"seo":{
This should not be in JSON.
It should get generated in markdown.
The JSON schema needs to be way more minimal. We want to reduce the burden on people adding events. It's fine if we harmonise a bit how events are displayed.
Having a JSON schema is a good idea. We should just use something like https://json-schema.org/, so that we can quickly run a validator, instead of using an ad-hoc schema format and then still needing dozens of if-statements.
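As a rough illustration of that suggestion, here is a sketch of a deliberately small JSON Schema for the event list, validated with the third-party `jsonschema` package (an assumption; any JSON Schema validator would do). The choice of required fields, the date pattern, and the name `schema_errors` are illustrative guesses, not what the PR ended up with.

```python
# Illustrative, deliberately minimal schema; the field list is an assumption.
EVENT_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {"type": "string", "minLength": 1},
            "id": {"type": "string"},
            "location": {"type": "string"},
            "startDate": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
            "endDate": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
        },
        "required": ["name", "startDate"],
        # Rejecting unknown keys catches typos such as "startdate".
        "additionalProperties": False,
    },
}


def schema_errors(events: list) -> list[str]:
    """Return one message per schema violation; an empty list means valid."""
    # Imported lazily because jsonschema is a third-party package
    # (pip install jsonschema).
    from jsonschema import Draft7Validator

    validator = Draft7Validator(EVENT_SCHEMA)
    return [
        f"{list(error.absolute_path)}: {error.message}"
        for error in validator.iter_errors(events)
    ]
```

`generate.py` could call something like `schema_errors(json.load(f))` once and abort if the list is non-empty, which replaces most of the per-field `isinstance` checks.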
005db5f to f368a03
Hey @bittlingmayer, I made quite a few changes.
Also, I have created a WMT Test event so we can see what it will look like. If everything is fine, I'll remove it. Please let me know what you think.
_data/events.json
Outdated
    "type":"Person",
    "url":"http://turing.iimas.unam.mx/americasnlp/st.html"
},
"futureTenseOpeningParagraph": "The <strong>Fourth AmericasNLP Competition</strong> will take place online in June, at NAACL 2024 in Mexico City. The competition focused on creating machine translation systems for indigenous languages from the Americas.",
Ideally we'd avoid this.
Also should we use Markdown instead of HTML?
Should we just remove the future/past tenses and only keep `opening_paragraph`? But then it would have to be modified manually once the event has happened.
Well, each event page is rendered through HTML. I don't think there is a straightforward way to use Markdown syntax inside HTML?
_data/events.json
Outdated
"speakers": [
    {
        "type": "Keynote speakers",
        "about": [
feels wrong
i'd expect something more like this:
"speakers": [
{ name: "Graham Neubig", type: "keynote" },
{ name: "Jaime Pérez González", type: "keynote" },
"Manuel Mager",
"Abteen Ebrahimi",
"Shruti Rijhwani",
"Arturo Oncevay",
"Luis Chiruzzo",
"Robert Pugh",
"Katharina Kann"
]
though not sure if the schema will allow either string or object
The schema won't allow this since an array should contain elements of the same type.
Probably something like this would do:
```
"speakers": [
  { "name": "Graham Neubig", "type": "keynote" },
  { "name": "Jaime Pérez González", "type": "keynote" },
  { "name": "Manuel Mager" },
  { "name": "Abteen Ebrahimi" },
  { "name": "Shruti Rijhwani" },
  { "name": "Arturo Oncevay" },
  { "name": "Luis Chiruzzo" },
  { "name": "Robert Pugh" },
  { "name": "Katharina Kann" }
]
```
@bittlingmayer, What do you think?
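If the generator ever wanted to accept both shapes discussed above, one option is to normalise the list before rendering, so that templates only ever see objects. This is a sketch under that assumption; `normalize_speakers` and the default type `"speaker"` are hypothetical.

```python
def normalize_speakers(raw: list) -> list[dict]:
    """Turn a mix of plain names and objects into uniform speaker objects.

    The default type "speaker" is a made-up placeholder for entries
    that are not marked as keynote.
    """
    speakers = []
    for entry in raw:
        if isinstance(entry, str):
            entry = {"name": entry}
        speakers.append({"name": entry["name"], "type": entry.get("type", "speaker")})
    return speakers
```

For example, `normalize_speakers([{"name": "Graham Neubig", "type": "keynote"}, "Manuel Mager"])` gives two objects, the second with the default type.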
languages/emj.md
Outdated
name: Yandex Translate
supported_qe_apis: []
seo:
  name: Machine translation for None
Seems like a bug
Yes, I'll fix that with a separate PR.
Fixed in #598.
Yes, using a schema is better! I made a few comments, but today is busy with the community meetup, so I will look more on the weekend.
a369da7 to 0be7a7c
type: Organization
name: European Association of Machine Translation
url: https://eamt.org
start_date: '2024-06-24'
I think we can print these without quotes.
Only if it is easy - I know it is probably just the default in the conversion to YAML.
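For reference, a sketch of how the quotes could go away, assuming the front matter is written with PyYAML (the PR does not say which library is used). PyYAML quotes a string that looks like a date so that it reads back as a string, but it writes a real `datetime.date` as a plain scalar. The function names and the `end_date` key are assumptions.

```python
from datetime import date


def with_native_dates(front_matter: dict, keys=("start_date", "end_date")) -> dict:
    """Return a copy in which ISO date strings under `keys` become date objects."""
    converted = dict(front_matter)
    for key in keys:
        value = converted.get(key)
        if isinstance(value, str):
            converted[key] = date.fromisoformat(value)
    return converted


def dump_front_matter(front_matter: dict) -> str:
    """Serialise front matter so that dates appear without quotes."""
    # Imported lazily because PyYAML is a third-party package and only an
    # assumption about how the YAML is produced here.
    import yaml

    return yaml.safe_dump(with_native_dates(front_matter), sort_keys=False)
```

One thing to keep in mind: an unquoted date is read back as a date rather than a string, so any template that treats `start_date` as text may need a small adjustment.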
Description
The events can be autogenerated by adding WMT events to `wmt_events.json` and the rest to `events.json`. I have tried to automate the most commonly added information about the events. However, specific information that is not automated can be added manually to the markdown file after generating the event.

The `_events.json` and `_wmt_events.json` files include the empty structure of the available fields that can optionally be automated. I have included them for reference.

Also, I have created a list to show all the MT Summit, EAMT, and AMTA events separately at the top of the page. This closes that part in #584.

How it works:

Validation: After the events are added to the corresponding `.json` file, `generate.py` validates the entries and creates a file inside the `events/` directory. For external events, or for events where there's no need to create a page, the `"id"` key-value pair should not be added. Examples are AMTA 2024, AmericasNLP and WMT24.

`events.md`: An appropriate header is created based on the event's `startDate`, and each event is listed under the corresponding header. For example, if the event starts in 2024 or 2025, it will be listed under 2024 Events or 2025 Events.

`calls-for-papers.md`: If the event has a call for papers, the `"callsForPapersDeadline": "some_date"` key-value pair should be given inside the `.json` file; otherwise the event won't be added to `calls-for-papers.md`.

Bold/unbold dates and names based on the deadline: The names and dates are in bold if the deadline has not passed, and as soon as it passes they will automatically unbold. The check to bold/unbold happens each time the website is generated, as it compares the `endDate` with the current date.

Past/future tense: When adding the event to the `.json` file, the future and past tenses can be added simultaneously. The future tense can be added as `"openingParagraph": "The event will take place"` and the past tense as `"pastOpeningParagraph": "The event took place"`; the appropriate tense will be shown automatically, using the same bold/unbold logic mentioned above.

Syntax: The markdown syntax should be replaced with HTML tags inside the `.json` file.
`<strong>Some word</strong>` should be used to bold the name.
`<a href='/link'>some_word</a>` should be used to add a link.
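The listing logic described above can be sketched roughly as follows. This is not the code in `generate.py`; the function names are hypothetical, and treating the last day of an event as still upcoming is an assumption.

```python
from collections import defaultdict
from datetime import date


def is_upcoming(event: dict, today: date) -> bool:
    """True while the event has not ended, i.e. while it is shown in bold."""
    end = date.fromisoformat(event.get("endDate") or event["startDate"])
    # Assumption: the event still counts as upcoming on its last day.
    return end >= today


def opening_paragraph(event: dict, today: date) -> str:
    """Pick the future-tense or past-tense paragraph based on the end date."""
    future = event.get("openingParagraph", "")
    if is_upcoming(event, today):
        return future
    # Fall back to the future-tense text if no past-tense text was given.
    return event.get("pastOpeningParagraph", future)


def group_by_year(events: list[dict]) -> dict[str, list[str]]:
    """Group event names under headers such as "2024 Events"."""
    groups = defaultdict(list)
    for event in events:
        year = date.fromisoformat(event["startDate"]).year
        groups[f"{year} Events"].append(event["name"])
    return dict(groups)
```

Passing `today` in explicitly, rather than calling `date.today()` inside the helpers, keeps the bold/unbold switch easy to test.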
Please let me know if you have any questions or suggestions.
Fixes #563