Parallel letter frequency #422

Merged
merged 27 commits into from
Jul 3, 2024
Changes from 11 commits
Commits
2b607d4
add parallel-letter-frequency, part of 48 in 24
kmarker1101 Jun 24, 2024
5e42d75
practice-exercise-generator: Don't include commas in test names (#420)
fapdash Jun 19, 2024
a095956
fix merge conflict
kmarker1101 Jun 24, 2024
ca7f180
fix json formating issue
kmarker1101 Jun 24, 2024
71baba9
fix merge issue
kmarker1101 Jun 24, 2024
55ef6a2
fix merge issue
kmarker1101 Jun 24, 2024
645a481
refactor to use parallism
kmarker1101 Jun 25, 2024
a79d868
check if ci failure was a fluke
kmarker1101 Jun 25, 2024
bb11561
fix ci failure
kmarker1101 Jun 25, 2024
e81e9d6
add empty string check
kmarker1101 Jun 25, 2024
b7138b8
fix merge conflict
kmarker1101 Jun 25, 2024
40ea8eb
return hash-table, update difficulty
kmarker1101 Jun 27, 2024
d03620f
cleanup, fix error
kmarker1101 Jun 27, 2024
f2018cf
add instructions.append
kmarker1101 Jun 27, 2024
1359c1b
Merge branch 'main' into parallel-letter-frequency
kmarker1101 Jun 27, 2024
dffc508
reformat instructions.append.md
kmarker1101 Jun 29, 2024
d374db3
Use extra file for subprocess code
fapdash Jun 30, 2024
183b6ba
bin/test-examples: Handle testing of exercises with additional files
fapdash Jun 30, 2024
af9b857
Don't use additional solution file
fapdash Jun 30, 2024
292d4db
add BNAndras and fapdash as contributors
kmarker1101 Jun 30, 2024
fae9f65
Use printed representation for hash table (de)serialization
fapdash Jul 1, 2024
24268e0
Revert changes to `bin/test-examples`
fapdash Jul 3, 2024
776d523
Merge branch 'main' into parallel-letter-frequency
kmarker1101 Jul 3, 2024
3366db3
Merge branch 'exercism:main' into parallel-letter-frequency
kmarker1101 Jul 3, 2024
0a72e9c
Merge branch 'parallel-letter-frequency' of github.com:kmarker1101/em…
kmarker1101 Jul 3, 2024
154b002
refactor tests
kmarker1101 Jul 3, 2024
2d1942f
Typo fix
BNAndras Jul 3, 2024
8 changes: 8 additions & 0 deletions config.json
@@ -894,6 +894,14 @@
        "practices": [],
        "prerequisites": [],
        "difficulty": 2
      },
      {
        "slug": "parallel-letter-frequency",
        "name": "Parallel Letter Frequency",
        "uuid": "3daf3903-1eb0-49b9-827a-76e6b7ca25fb",
        "practices": [],
        "prerequisites": [],
        "difficulty": 8
      }
    ]
  },
@@ -0,0 +1,7 @@
# Instructions

Count the frequency of letters in texts using parallel computation.

Parallelism is about doing things in parallel that can also be done sequentially.
A common example is counting the frequency of letters.
Employ parallelism to calculate the total frequency of each letter in a list of texts.
17 changes: 17 additions & 0 deletions exercises/practice/parallel-letter-frequency/.meta/config.json
@@ -0,0 +1,17 @@
{
  "authors": [
    "kmarker1101"
  ],
  "files": {
    "solution": [
      "parallel-letter-frequency.el"
    ],
    "test": [
      "parallel-letter-frequency-test.el"
    ],
    "example": [
      ".meta/example.el"
    ]
  },
  "blurb": "Count the frequency of letters in texts using parallel computation."
}
89 changes: 89 additions & 0 deletions exercises/practice/parallel-letter-frequency/.meta/example.el
@@ -0,0 +1,89 @@
;;; parallel-letter-frequency.el --- Parallel Letter Frequency (exercism) -*- lexical-binding: t; -*-

;;; Commentary:

;;; Code:

(require 'cl-lib)

(defun clean-text (text)
  "Clean TEXT by removing numbers, punctuation, and whitespace, keeping only alphabetic characters and converting to lowercase."
  (downcase (replace-regexp-in-string "[^[:alpha:]]" "" text)))

(defun combine-frequencies (freqs-list)
  "Combine a list of frequency hash tables in FREQS-LIST into a single hash table."
  (let ((combined-freqs (make-hash-table :test 'equal)))
    (dolist (freqs freqs-list)
      (maphash (lambda (key value)
                 (puthash key (+ value (gethash key combined-freqs 0)) combined-freqs))
               freqs))
    combined-freqs))

(defun serialize-hash-table (hash-table)
  "Serialize HASH-TABLE to a string."
  (let (result)
    (maphash (lambda (key value)
               (push (format "%c:%d" key value) result))
             hash-table)
    (string-join (reverse result) ",")))

(defun deserialize-hash-table (str)
  "Deserialize STR to a hash table."
  (let ((hash-table (make-hash-table :test 'equal)))
    (unless (string= str "") ; Check if the string is empty
      (dolist (item (split-string str ","))
        (let ((pair (split-string item ":")))
          (puthash (string-to-char (nth 0 pair))
                   (string-to-number (nth 1 pair))
                   hash-table))))
    hash-table))

(defun calculate-frequencies (texts)
  "Calculate letter frequencies for each string in TEXTS using processes."
  (let ((cleaned-texts (mapcar #'clean-text texts)))
    (if (cl-every #'string-empty-p cleaned-texts)
        '() ;; Return empty list if all cleaned texts are empty
      (let* ((num-processes (min (length cleaned-texts) (max 1 (string-to-number (shell-command-to-string "nproc")))))
             (texts-per-process (ceiling (/ (float (length cleaned-texts)) num-processes)))
             (results (make-hash-table :test 'equal))
             (pending (ceiling (/ (float (length cleaned-texts)) texts-per-process))) ; count only non-empty chunks so the wait loop terminates
             (final-result nil)
             (processes nil))
        ;; Create processes
        (dotimes (i num-processes)
          (let* ((start-index (* i texts-per-process))
                 (end-index (min (* (1+ i) texts-per-process) (length cleaned-texts)))
                 (process-texts (if (< start-index (length cleaned-texts))
                                    (cl-subseq cleaned-texts start-index end-index)
                                  '())))
            (when (not (null process-texts))
              (let* ((command (format "(princ (let ((freqs (make-hash-table :test 'equal))) (dolist (text '%S) (let ((text-freqs (make-hash-table :test 'equal))) (dolist (char (string-to-list text)) (when (string-match-p \"[[:alpha:]]\" (char-to-string char)) (puthash char (1+ (gethash char text-freqs 0)) text-freqs))) (maphash (lambda (key value) (puthash key (+ value (gethash key freqs 0)) freqs)) text-freqs))) (let (result) (maphash (lambda (key value) (push (format \"%%c:%%d\" key value) result)) freqs) (string-join (reverse result) \",\")))))"
                                      process-texts))
                     (process (make-process
                               :name (format "letter-freq-process-%d" i)
                               :buffer (generate-new-buffer (format " *letter-freq-process-%d*" i))
                               :command (list "emacs" "--batch" "--eval" command)
                               :sentinel (lambda (proc _event)
                                           (when (eq (process-status proc) 'exit)
                                             (with-current-buffer (process-buffer proc)
                                               (let ((result (deserialize-hash-table (buffer-string))))
                                                 (maphash (lambda (key value)
                                                            (puthash key (+ value (gethash key results 0)) results))
                                                          result))
                                               (setq pending (1- pending))
                                               (when (= pending 0)
                                                 (setq final-result (combine-frequencies (list results)))
                                                 (let ((sorted-result nil))
                                                   (maphash (lambda (key value)
                                                              (push (list (char-to-string key) value) sorted-result))
                                                            final-result)
                                                   (setq sorted-result (sort sorted-result (lambda (a b) (string< (car a) (car b)))))
                                                   (setq final-result (apply 'append sorted-result))))))))))
                (push process processes)))))
        ;; Wait for all processes to finish
        (while (> pending 0)
          (sleep-for 0.1))
        final-result))))

(provide 'parallel-letter-frequency)
;;; parallel-letter-frequency.el ends here
49 changes: 49 additions & 0 deletions exercises/practice/parallel-letter-frequency/.meta/tests.toml
@@ -0,0 +1,49 @@
# This is an auto-generated file.
#
# Regenerating this file via `configlet sync` will:
# - Recreate every `description` key/value pair
# - Recreate every `reimplements` key/value pair, where they exist in problem-specifications
# - Remove any `include = true` key/value pair (an omitted `include` key implies inclusion)
# - Preserve any other key/value pair
#
# As user-added comments (using the # character) will be removed when this file
# is regenerated, comments can be added via a `comment` key.

[c054d642-c1fa-4234-8007-9339f2337886]
description = "no texts"

[818031be-49dc-4675-b2f9-c4047f638a2a]
description = "one text with one letter"

[c0b81d1b-940d-4cea-9f49-8445c69c17ae]
description = "one text with multiple letters"

[708ff1e0-f14a-43fd-adb5-e76750dcf108]
description = "two texts with one letter"

[1b5c28bb-4619-4c9d-8db9-a4bb9c3bdca0]
description = "two texts with multiple letters"

[6366e2b8-b84c-4334-a047-03a00a656d63]
description = "ignore letter casing"

[92ebcbb0-9181-4421-a784-f6f5aa79f75b]
description = "ignore whitespace"

[bc5f4203-00ce-4acc-a5fa-f7b865376fd9]
description = "ignore punctuation"

[68032b8b-346b-4389-a380-e397618f6831]
description = "ignore numbers"

[aa9f97ac-3961-4af1-88e7-6efed1bfddfd]
description = "Unicode letters"

[7b1da046-701b-41fc-813e-dcfb5ee51813]
description = "combination of lower- and uppercase letters, punctuation and white space"

[4727f020-df62-4dcf-99b2-a6e58319cb4f]
description = "large texts"

[adf8e57b-8e54-4483-b6b8-8b32c115884c]
description = "many small texts"