
License ? #1

Open
suprafun opened this issue Dec 31, 2023 · 1 comment

@suprafun

Hello, this seems like a very interesting project. Under what license is it released? Do you have a paper or blog post that describes your motivations and findings? If not, I encourage you to at least write a blog post. I am sure many people would be interested in reading it.

@JRC1995
Owner

JRC1995 commented Jan 1, 2024

Thank you for your interest.

For my part of the code, I am willing to allow an MIT license. I didn't explicitly include a license because some data-processing code (utils.py, answer_extraction.py) is borrowed from https://github.com/AGI-Edgerunners/Plan-and-Solve-Prompting (which, unless I missed it, has no explicit license), and I also don't know the exact licenses of the datasets. In short, I can cover anything else besides those under an MIT license.

This work mostly continues the "decomposition + self-evaluation-guided search" line of approaches:

A key inspiration is this paper: https://arxiv.org/abs/2305.00633 (also this one: https://arxiv.org/abs/2305.14992, and a related codebase: https://github.com/Ber666/llm-reasoners). There are other similar works as well, such as Tree of Thoughts, several works on "factored cognition"/decomposed QA, and recent work on chain-of-verification and the like (see: https://arxiv.org/abs/2307.11768).

However, one key difference is that I am exploring the above in a zero-shot regime (whereas most others work few-shot). One convenience of the few-shot approach is that you can define a structured answer format through the few-shot examples, making decomposition of reasoning steps straightforward (as in the papers above). However, I wanted to explore pure zero-shot prompting a bit more. In some contexts we may not have good input-output examples in mind and may want the LLM to provide insights from its pre-training corpus (for example, if you don't know much about the task). This, however, means I cannot provide examples of how to decompose reasoning steps. So the project explores possible ways to decompose in a zero-shot manner (for example, by providing very precise instructions, clever prompting, or clever use of newlines), and then explores various ways to search and to collate multi-sample results (inspired by self-consistency: https://arxiv.org/abs/2203.11171).
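For what it's worth, the collation step inspired by self-consistency can be sketched roughly like this (a minimal illustration, not the project's actual code; the sampling of reasoning chains is mocked, and the answer strings are hypothetical):

```python
from collections import Counter

def collate_self_consistency(answers):
    """Majority-vote over the final answers extracted from multiple
    independently sampled reasoning chains (self-consistency)."""
    # Normalize so trivially different strings vote together;
    # drop chains where no answer could be extracted.
    normalized = [a.strip().lower() for a in answers if a is not None]
    if not normalized:
        return None
    # The most frequent final answer wins the vote.
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner

# Hypothetical final answers extracted from 5 sampled CoT chains:
samples = ["42", "42", "41", "42 ", None]
print(collate_self_consistency(samples))  # → 42
```

In practice the answers would come from sampling the LLM several times at nonzero temperature and running answer extraction on each chain; the voting itself is this simple.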

I may work on a paper involving some of these ideas sometime in the next few months. Currently, I don't have any particularly interesting findings. Generally, I found that the simple auto-CoT (CoT+STEP) + self-consistency baseline beats most of the other methods I tried when using the LLaMA-instruct 30B model and some others: https://docs.google.com/document/d/1OoWczZVkRjLzXpgr7VCr9izkEVLDhM118AedD0G6zOg/edit?usp=sharing

There may be other avenues to explore, though, such as using rewards to filter out wrong results (and testing precision/recall with that) or some form of faithfulness tests. I may also play around with some of the newer models (SOLAR, Mixtral, Zephyr, etc.).
