Adds workflow to build and deploy internal blog website powered by Hugo and GitHub pages. (#13)

* Adds workflow to build and deploy Hugo website.

* Updates to run deploy on PR.

* First blog post

* Update path.

* Adds cd blog.

* Updates path again.

* Adds PR template

* Updates theme.

* Create second-post.md

---------

Co-authored-by: Yaroslav Biziuk <131271125+eLQeR@users.noreply.github.com>
iramykytyn and eLQeR authored Oct 17, 2024
1 parent 712efa5 commit 3694db8
Showing 29 changed files with 1,762 additions and 0 deletions.
18 changes: 18 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,18 @@
### What does it do?

### What else do you need to know?

### Checklist
- [ ] Set base branch to `main/dev`
- [ ] Pipeline set to `Review/QA`
- [ ] Pull request is prepared: Added to sprint, added yourself as assignee, issue is linked, reviewers are added
- [ ] PR is not too big (ideally about 200-400 changed lines)
- [ ] PR/issue estimate is updated if significant time was spent during the PR review stage
- [ ] Doesn't create any new `FIXME` or `TODO` comments. If it does, please explain and create a separate issue in the backlog
- [ ] Adds or updates tests
- [ ] Adds blog post
- [ ] Tested manually (test scripts, etc.)
- [ ] Any required documentation updates done (examples: **/*.md, blog).
- [ ] Follows best practice rules: https://google.github.io/styleguide/pyguide.html

### Demo screenshots or loom
84 changes: 84 additions & 0 deletions .github/workflows/hugo.yaml
@@ -0,0 +1,84 @@
# Sample workflow for building and deploying a Hugo site to GitHub Pages
name: Deploy Hugo site to Pages

on:
# Runs on pushes targeting the default branch
push:
branches:
- main
- dev
pull_request:
branches:
- main
- dev

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
contents: read
pages: write
id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
group: "pages"
cancel-in-progress: false

# Default to bash
defaults:
run:
shell: bash

jobs:
# Build job
build:
runs-on: ubuntu-latest
env:
HUGO_VERSION: 0.134.2
steps:
- name: Install Hugo CLI
run: |
wget -O ${{ runner.temp }}/hugo.deb https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_linux-amd64.deb \
&& sudo dpkg -i ${{ runner.temp }}/hugo.deb
- name: Install Dart Sass
run: sudo snap install dart-sass
- name: Checkout
uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 0
- name: Setup Pages
id: pages
uses: actions/configure-pages@v5
- name: Install Node.js dependencies
run: "[[ -f package-lock.json || -f npm-shrinkwrap.json ]] && npm ci || true"
- name: Build with Hugo
env:
HUGO_CACHEDIR: ${{ runner.temp }}/hugo_cache
HUGO_ENVIRONMENT: production
TZ: America/Los_Angeles
run: |
cd blog \
&& hugo \
--gc \
--minify \
--baseURL "${{ steps.pages.outputs.base_url }}/"
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
with:
path: ./blog/public

# Deployment job
deploy:
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
needs: build
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
6 changes: 6 additions & 0 deletions .gitmodules
@@ -0,0 +1,6 @@
[submodule "blog/themes/ananke"]
path = blog/themes/ananke
url = https://github.com/theNewDynamic/gohugo-theme-ananke.git
[submodule "blog/themes/papermod"]
path = blog/themes/papermod
url = https://github.com/adityatelange/hugo-PaperMod
5 changes: 5 additions & 0 deletions blog/archetypes/default.md
@@ -0,0 +1,5 @@
+++
title = '{{ replace .File.ContentBaseName "-" " " | title }}'
date = {{ .Date }}
draft = true
+++
52 changes: 52 additions & 0 deletions blog/content/posts/my-first-post.md
@@ -0,0 +1,52 @@
+++
title = 'Welcome To LLM Integrations Research Blog by COXIT'
date = 2024-10-10T19:19:14+03:00
draft = false
author = 'Iryna Mykytyn'
+++

### Welcome
Welcome to the internal COXIT blog about Prompt Engineering and LLM integrations.

This is COXIT's internal R&D project. We aim to investigate different tools, approaches, and prompt engineering techniques for building production-ready, LLM-powered solutions, trying the tools currently available on the market to determine what works and what causes problems for our specific use cases.

As a result of this project, we will build a knowledge base describing which of the available LLM evaluation tools, frameworks, LLMs themselves, and prompt engineering techniques worked best for our specific LLM-related cases, explain what didn't work, and why.


To make a new post, install [hugo](https://gohugo.io/getting-started/installing/), then run:
```bash
cd blog
git submodule update --init --recursive # fetch the theme submodules
hugo new posts/my-first-post.md # adjust filename to match the title of your post
hugo server -D
```

### Tech Stack

This blog is powered by [hugo](https://gohugo.io/) + GitHub Actions + GitHub Pages.

Pages are simple Markdown, so GitHub's preview functionality is sufficient.

My favorite features of this setup:
- Static, secure, autogenerated, with no infra outside of the GitHub repo
- Client-side [search](/search/) (see [index.json](/index.json)) powered by [fuse.js](https://www.fusejs.io/)
- Single binary to download for editing on a local computer
- Option to avoid installing Hugo locally and just create/edit pages via the edit-on-GitHub UX
- Markdown

Cons:
- Using GitHub auth prevents RSS :(

### Commands used to set up the blog

This blog was set up using:
```bash
hugo new site blog
cd blog
git submodule add https://github.com/theNewDynamic/gohugo-theme-ananke.git themes/ananke
echo "theme = 'ananke'" >> hugo.toml
```
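The theme was later switched to PaperMod (see `.gitmodules` and `hugo.toml`); a sketch of the equivalent commands:
```bash
git submodule add https://github.com/adityatelange/hugo-PaperMod themes/papermod
# then point hugo.toml at the new theme: theme = 'papermod'
```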
Using instructions from:
- https://github.com/adityatelange/hugo-PaperMod/wiki/Installation
- https://gohugo.io/getting-started/quick-start/
- https://gohugo.io/hosting-and-deployment/hosting-on-github/
105 changes: 105 additions & 0 deletions blog/content/posts/second-post.md
@@ -0,0 +1,105 @@
+++
title = 'Prompt Engineering through Structured Instructions and Advanced Techniques'
date = 2024-10-10T11:59:14+03:00
draft = false
author = 'Yaroslav Biziuk'
+++

[Prompt Engineering through Structured Instructions and Advanced Techniques is described on Google Docs with photo examples](https://docs.google.com/document/d/10pz3nPghcG3tyN9RuzrNerfcbeP59kq1YJRXjBApTQY/edit)

# 1. Introduction
Large language models (LLMs) are powerful tools for a variety of tasks, but their effectiveness is highly dependent on the design of prompts. This article examines advanced techniques in prompt engineering, focusing on the impact of instruction order, the "Ask Before Answer" technique, and the "Chain of Thought" (CoT) method, among others. By optimizing these factors, we can significantly enhance the accuracy and reliability of LLM outputs.

# 2. The Importance of Instruction Order
Instruction order plays a crucial role in prompt engineering. Altering the sequence of instructions or actions can drastically change the outcome produced by the LLM. For instance, when we previously placed an instruction about not considering "semi-exposed surfaces" as the eleventh step, the LLM would still process these surfaces as it followed each step sequentially, reaching the exclusion instruction too late to apply it effectively. However, when this instruction was moved to precede all other steps, the LLM correctly disregarded "semi-exposed" surfaces. This demonstrates the necessity of positioning general concepts or definitions above the specific step-by-step instructions, ensuring they are applied throughout the process.
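
As a minimal sketch (hypothetical wording, not the exact production instructions shown in the screenshot), the reordering looks like this:
```
# Worse: the exclusion arrives too late
1. Identify all cabinet surfaces.
...
11. Do not consider semi-exposed surfaces.

# Better: the general rule comes first
Do not consider semi-exposed surfaces.
1. Identify all cabinet surfaces.
...
```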

Example:
<img width="1249" alt="image" src="https://github.com/user-attachments/assets/98f52cab-00d8-4ba5-80c0-330772d97d87">


# 3. The "Ask Before Answer" Technique
The "Ask Before Answer" technique is particularly effective when optimizing prompts to reduce the occurrence of hallucinations. By prompting the LLM to seek clarification before resolving a task, we can preempt misunderstandings that might lead to incorrect answers.

**Example Prompt:**

```
You are a Casework expert. You review the specification and follow the instruction to pick the correct Series option which represents cabinet materials and thickness:
OPTIONS: {OPTIONS}
INSTRUCTION: {INSTRUCTION}
SPECIFICATION: {INPUT_TEXT}
If you have questions or misunderstandings, ask about it before resolving the task.
Before proceeding with the itinerary, please ask for any clarifications or additional details.
I will give more info if you need.
```
Result:
<img width="1208" alt="image" src="https://github.com/user-attachments/assets/02c32f33-4a91-49b0-88ff-bcd1877f62e1">


When applying this technique, we ask the LLM to identify specific areas where it may be uncertain or confused in resolving a test case. By doing so, we can pinpoint where hallucinations occur, understand why the LLM struggles with certain choices, and refine the prompt in those areas where the model tends to get confused. This method is highly effective in improving the quality of the instructions provided in the prompt.

# 4. The Chain of Thought (CoT) Technique

**Result without CoT:**
<img width="1208" alt="image" src="https://github.com/user-attachments/assets/69624492-d2d9-4a7f-9813-ce36e4a935ec">


One of the most critical steps in creating an effective prompt with complex instructions is the use of the Chain of Thought (CoT) technique. By including phrases like "You think step by step," "Take your time," or "Explain every step," the LLM is given time to reflect and process all input data. This approach significantly improves the results, making them more logical and coherent. However, caution is needed when using "Explain every step," as the LLM can sometimes provide the most likely correct answer without fully understanding why, leading to hallucinations.
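
As an illustration (hypothetical wording, not the exact prompt from our test cases), a CoT trigger is simply appended to the task prompt:
```
You are a Casework expert. Review the specification and pick the correct Series option.
SPECIFICATION: {INPUT_TEXT}
You think step by step. Take your time and explain every step before giving the final answer.
```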

**Result with CoT:**
<img width="1325" alt="image" src="https://github.com/user-attachments/assets/0f056075-239f-434d-b34e-fce3f5d4bf32">


# 5. Meta-Prompting: An Advanced Strategy in Prompt Engineering
Meta-prompting is an advanced technique in prompt engineering that goes beyond merely guiding a language model (LLM) through a task. It involves crafting prompts that instruct the model on how to think or approach a problem before the primary task begins. This strategic layer of prompting enhances the LLM's ability to navigate complex instructions by embedding a meta-level understanding of the task at hand. For example, instead of directly asking the LLM to solve a problem, a meta-prompt might instruct the model to first assess whether it fully understands the task, identify potential ambiguities, and request clarification if necessary.
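
A minimal sketch of such a meta-prompt (hypothetical wording, following the pattern described above):
```
Before solving the task:
1. Restate the task in your own words.
2. List any ambiguities or missing information you notice.
3. If anything is unclear, ask for clarification instead of guessing.
Only after that, proceed with the task itself.
```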

When applied to Claude, meta-prompting proved more effective than with GPT models. It significantly improved test case outcomes by making instructions simpler and clearer for the model to understand. Claude's advanced processing capabilities allowed it to better interpret and act on the meta-prompts, leading to more accurate and consistent results.

**Example of how meta-prompting optimized Claude outputs:**
<img width="1222" alt="image" src="https://github.com/user-attachments/assets/39ea1f66-9536-4cee-b848-c45daa351a47">


However, in our specific case, meta-prompting did not lead to the exceptional results we had hoped for. While it is a valuable technique, its effectiveness can vary depending on the complexity of the task and the model's inherent capabilities.

# 6. Explanatory Instructions
One key insight in optimizing prompt engineering is the importance of providing explanations for why certain choices should be made. Adding reasoning behind instructions helps LLMs make better-informed decisions, thereby improving their overall performance.

**For example:**

- **Worse:** "If both HPL and TFL can be used, choose TFL."
- **Better:** "If both HPL and TFL can be used, choose TFL as the more cost-effective option."

In the "better" example, the LLM is not only told which option to choose but also why that choice is preferable. This additional context helps the model understand the underlying logic and apply it more consistently across different scenarios.

# 7. Simplify Your Instructions
When writing instructions for LLMs, it’s crucial to keep them clear and simple. If people find it hard to read and understand the instructions, the model will struggle even more. Use plain language and short sentences, as if you’re explaining things to a child.

For instance:

- **Worse:** "If front materials are not specified or not matched with TFL surfaces, HPL, or A-Tech Surface, choose HPL as the default option."
- **Better:** "If fronts are not specified, default to HPL unless TFL or A-Tech Surface is mentioned."

In the "better" example, the instructions are straightforward and easier to understand.

# 8. Imagine You're Training a Colleague
When crafting prompts, it's essential to approach the task as if you are training a child or a new colleague at work to complete a specific task. To achieve this, you need to provide the individual with sufficient context and detailed, clear instructions.

# 9. Assigning a Role in Prompting
An important aspect of creating effective prompts is assigning a specific role to the language model (LLM). This role definition helps limit the data the LLM will use to solve the task and sets the appropriate tone and format for the response.

**Example:**

- **Worse:** "You are a Casework expert."
- **Better:** "You are a Casework expert tasked with reviewing a specification and selecting the correct Series option that represents cabinet materials and thickness."

This role-based approach provides the LLM with a clear context and specific guidelines, which positively influences the accuracy and relevance of the model's responses.

# 10. Conclusion
In prompt engineering, the careful structuring of instructions, along with techniques like "Meta-Prompting" and "Chain of Thought," can dramatically enhance the performance of language models. These methods help reduce hallucinations, improve clarity, and ensure more accurate outcomes.

# 11. References
- [Prompt Engineering for ChatGPT: A Comprehensive Guide](https://medium.com/@seyibello31/prompt-engineering-for-chatgpt-a-comprehensive-guide-6650cdf0a047)
- [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/pdf/2406.06608)
4 changes: 4 additions & 0 deletions blog/hugo.toml
@@ -0,0 +1,4 @@
baseURL = 'https://coxit.co/'
languageCode = 'en-us'
title = 'LLM Integrations Research Blog by COXIT'
theme = 'papermod'