-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emily's internship blog #192
base: main
Are you sure you want to change the base?
Conversation
💖 Thanks for opening this pull request! 💖 |
@EmilyXinyi Could you maybe add a link to or embed your Vlog here? I think it would be a nice addition to this well written blogpost about your internship experience. |
I embedded the video for now, though I am not sure how good it would look on the blog because it's a portrait (vertical) video. Alternatively I can post it on some social media first and link it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor edits
|
||
### Open Source Developement | ||
|
||
I started my contributions by adapting certain metrics (tweedie, mean absolute percentage error etc.) to be Array API compatible under the guidance of my mentor, Olivier. The Array API standard is a cross-library API for array operations on Python, which is designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written in NumPy for CPU can work on other hardwares (GPU) with PyTorch or CuPy, greatly improving performance. As I gained more familiarity with the scikit-learn codebase and Array API, I began working on adapting “larger” functions to be Array API compatible, which means a lot more fundamental, a lot more dependencies, a lot more challenging, and a lot more fun. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started my contributions by adapting certain metrics (tweedie, mean absolute percentage error etc.) to be Array API compatible under the guidance of my mentor, Olivier. The Array API standard is a cross-library API for array operations on Python, which is designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written in NumPy for CPU can work on other hardwares (GPU) with PyTorch or CuPy, greatly improving performance. As I gained more familiarity with the scikit-learn codebase and Array API, I began working on adapting “larger” functions to be Array API compatible, which means a lot more fundamental, a lot more dependencies, a lot more challenging, and a lot more fun. | |
I started my contributions by adapting certain metrics (tweedie, mean absolute percentage error etc.) to be Array API compatible under the guidance of my mentor, [Olivier](https://github.com/ogrisel). The Array API standard is a cross-library API for array operations on Python, which is designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written in NumPy for CPU can work on other hardwares (GPU) with PyTorch or CuPy, greatly improving performance. As I gained more familiarity with the scikit-learn codebase and Array API, I began working on adapting “larger” functions to be Array API compatible, which means a lot more fundamental, a lot more dependencies, a lot more challenging, and a lot more fun. |
|
||
### Chinese Community Outreach | ||
|
||
China has the second largest user group of scikit-learn. As a community, we believe that we can be more inclusive to ease Chinese contribution and do what is necessary to recruit more Chinese contributors. Therefore, I need to find out who and where scikit-learn is being used, if there are other platforms (outside of GitHub) that development is happening, because GitHub tends to be very slow in China, and establish scikit-learn’s official presence in the Chinese community. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
China has the second largest user group of scikit-learn. As a community, we believe that we can be more inclusive to ease Chinese contribution and do what is necessary to recruit more Chinese contributors. Therefore, I need to find out who and where scikit-learn is being used, if there are other platforms (outside of GitHub) that development is happening, because GitHub tends to be very slow in China, and establish scikit-learn’s official presence in the Chinese community. | |
China has the second largest user group of scikit-learn according to documentation web analytics. As a community, we believe that we can be more inclusive to ease Chinese contribution and do what is necessary to onboard more Chinese contributors. Therefore, I need to find out who and where scikit-learn is being used, if there are other platforms (outside of GitHub) that development is happening, because GitHub tends to be very slow in China, and establish scikit-learn’s official presence in the Chinese community. |
|
||
I also had weekly Peer Programming sessions with Loïc and Stefanie, where my piled-up questions from the week outside of Array API would be answered, and I would almost always learn something new about developer tools or programming fundamentals. | ||
|
||
On the Chinese community outreach side, it has always been with the scikit-learn communications team. Here I must give a special shoutout to manager François, who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous doing this kind of task and using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Chinese community outreach side, it has always been with the scikit-learn communications team. Here I must give a special shoutout to manager François, who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous doing this kind of task and using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence. | |
On the Chinese community outreach side, it has always been with the scikit-learn communications team. Here I must give a special shoutout to manager [François](https://www.linkedin.com/in/françois-goupil/), who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous doing this kind of task and using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also seems that assets/videos/emily_blog_vid.MOV
can be removed now that we are using the TikTok link?
Blog post about Emily's summer internship
cc: @francoisgoupil