Add a script for cleaning up whitespace in Obsidian

ID
1dcc7c7
date
2023-12-26 09:51:45+00:00
author
Alex Chan <alex@alexwlchan.net>
parent
9684e7f
message
Add a script for cleaning up whitespace in Obsidian
changed files
2 files, 48 additions, 1 deletion

Changed files

text/README.md (5333) → text/README.md (5777)

diff --git a/text/README.md b/text/README.md
index 01140df..7f45da8 100644
--- a/text/README.md
+++ b/text/README.md
@@ -32,6 +32,12 @@ scripts = [
         """,
     },
     {
+        "usage": "fix_whitespace [PATH]",
+        "description": """
+        when I copy/paste text into Obsidian from the web, this cleans up some of the extraneous whitespace.
+        """,
+    },
+    {
         "usage": "midline [PATH]",
         "description": "print the line in the middle of a file, e.g. if the file has 5 lines, it prints line 3"
     },
@@ -103,6 +109,14 @@ when I copy/paste a Twitter thread into Obsidian, this does some
 initial tidying up of the formatting for me.
 </dd>
 <dt>
+<a href="https://github.com/alexwlchan/scripts/blob/main/text/fix_whitespace">
+<code>fix_whitespace [PATH]</code>
+</a>
+</dt>
+<dd>
+when I copy/paste text into Obsidian from the web, this cleans up some of the extraneous whitespace.
+</dd>
+<dt>
 <a href="https://github.com/alexwlchan/scripts/blob/main/text/midline">
 <code>midline [PATH]</code>
 </a>
@@ -179,4 +193,4 @@ read UTF-8 on stdin and print out the raw Unicode "
 "codepoints. This is a Docker wrapper around <a href="https://github.com/lunasorcery/utf8info">a tool of the same name</a> by @lunasorcery.
 </dd>
 </dl>
-<!-- [[[end]]] (checksum: afe7ca0c0f56a356286950da6b8332a3) -->
\ No newline at end of file
+<!-- [[[end]]] (checksum: 1ac648df181fdf4aea491bce61f10523) -->
\ No newline at end of file

text/fix_whitespace (0) → text/fix_whitespace (655)

diff --git a/text/fix_whitespace b/text/fix_whitespace
new file mode 100755
index 0000000..ff04394
--- /dev/null
+++ b/text/fix_whitespace
@@ -0,0 +1,33 @@
+#!/usr/bin/env python3
+"""
+Remove extra whitespace from a text file.
+
+When I copy/paste text from the web into Obsidian, it's often inserted
+with a lot of additional whitespace, e.g. text like
+
+    hello\n\nworld
+
+becomes
+
+    hello\n\n  \n\nworld
+
+This script cleans up some of this extraneous whitespace for me.
+"""
+
+import re
+import sys
+
+
+if __name__ == '__main__':
+    try:
+        path = sys.argv[1]
+    except IndexError:
+        sys.exit(f"Usage: {__file__} <path>")
+
+    with open(path, "r", encoding="utf-8") as infile:
+        text = infile.read()
+
+    text = re.sub(r"\n\n\s*\n\n", "\n\n", text)
+
+    with open(path, "w", encoding="utf-8") as outfile:
+        outfile.write(text)
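
To see what the substitution in `fix_whitespace` does in isolation, here is a minimal sketch that applies the same regular expression to in-memory strings instead of a file. The helper function `fix_whitespace()` and the constant `EXTRA_BLANK_LINES` are hypothetical names introduced for illustration; they are not part of the commit.

```python
import re

# Same pattern as the script: two newlines, any run of whitespace
# (which may itself contain more newlines), then two more newlines.
# The constant name is an assumption made for this sketch.
EXTRA_BLANK_LINES = re.compile(r"\n\n\s*\n\n")


def fix_whitespace(text: str) -> str:
    """Collapse whitespace-only gaps between paragraphs into one blank line.

    Hypothetical helper: the committed script runs re.sub() inline
    on the contents of the file instead of defining a function.
    """
    return EXTRA_BLANK_LINES.sub("\n\n", text)


if __name__ == "__main__":
    # The example from the script's docstring.
    print(repr(fix_whitespace("hello\n\n  \n\nworld")))
    # Text that is already clean is left alone.
    print(repr(fix_whitespace("hello\n\nworld")))
    # A run of three newlines has no second "\n\n" to match, so it stays.
    print(repr(fix_whitespace("a\n\n\nb")))
```

Because `\s` also matches newlines, the pattern collapses any whitespace-only gap that contains at least four newlines, not only the exact `\n\n  \n\n` shape shown in the docstring. Gaps with three newlines are left untouched, which fits the commit message's wording that the script cleans up "some of" the extraneous whitespace.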