
Emacs: Auto-format strings to a max column width in `python-mode`

I use blacken-mode (which runs black) to auto-format Python code on save. It works great, but it is not much help when dealing with strings. For example, if I want to cap the number of columns (characters) per line while writing a string, black is of no help. So I needed something in Emacs that auto-formats strings, enforcing a maximum number of columns per line on the fly as I type.

I came up with the following elisp function to do the above:

(defun local-auto-format-max-column-width-string-python-mode (&optional max-column-width)
  "Auto-format lines containing strings in `python-mode' to `MAX-COLUMN-WIDTH' characters.

This uses the `current-column' function, which doesn't consider the
`column-number-indicator-zero-based' variable, so the column number starts from 0.

If not provided, the `MAX-COLUMN-WIDTH' defaults to 88 characters/columns.

As column numbers in Emacs start from 0, the check for the current line is done at
(+ `MAX-COLUMN-WIDTH' 1) to avoid side effects when jumping through the code rather
than actually typing.

When the limit is reached:
- if the string is inside a function (e.g. `print'), it moves the start of the string
 (with any existing string-prefixes and quote) to the next line, then moves the `point'
 to the end
- if not inside a function, closes the current line at `MAX-COLUMN-WIDTH' with the
 initially used quote (and any existing string-prefixes), jumps to the correctly
 prefixed-quoted-indented next line, and sets the `point'

N.B: This doesn't handle triple-quoted strings at the moment."

  (when (equal major-mode 'python-mode)
    (setq max-column-width (or max-column-width 88))
    (when (equal (current-column) (+ max-column-width 1))
      (let
          ((selection)
           (quoted-string-obj)
           (current-string-quote)
           (indentation-and-quote))

        (setq selection (let ((this-line (thing-at-point 'line 'no-properties)))
                          (string-match "^\\(.+?\\)[[:blank:]]*\\'" this-line)
                          (match-string 1 this-line)))


        ;; Only do the auto-formatting when we're on an unclosed single/double quote (string)
        (when (string-match "^[^'\"]*\\(['\"]\\)[^'\"]+\\'" selection)
          (setq current-string-quote (match-string 1 selection))
          (cond
           ;; Cases like `func(spam="egg...`, `func(spam=egg, foo="bar...`, `func(\nspam=egg, foo="bar...`
           ((string-match "^[^,({\\[]+[,({\\[][[:blank:]]*\\(.*\\'\\)" selection)
            (setq quoted-string-obj (match-string 1 selection))
            (goto-char
             (- (point) (length quoted-string-obj)))
            (newline)
            (indent-for-tab-command)
            (end-of-line))

           ;; Cases like `func(\nspam="egg..."`
           ((string-match "^\\([[:blank:]]*\\)[^=]+=\\([[:blank:]]*.*\\'\\)" selection)
            (setq quoted-string-obj (match-string 2 selection))
            (goto-char
             (- (point) (length quoted-string-obj)))
            (insert "(")
            (newline)
            (end-of-line)
            (newline)
            (insert (concat (make-string (length (match-string 1 selection)) ? ) ")"))
            (previous-line)
            (beginning-of-line)
            (indent-for-tab-command)
            (end-of-line)
            (when (> (current-column) max-column-width)
              (setq selection (let ((this-line (thing-at-point 'line 'no-properties)))
                                (string-match "^\\(.+?\\)[[:blank:]]*\\'" this-line)
                                (match-string 1 this-line)))
              (when (string-match "^\\([[:blank:]]*\\(?:[A-Za-z]*\\(['\"]\\)\\)?\\)[^'\"]+\\'" selection)
                (setq indentation-and-quote (match-string 1 selection))
                (setq current-string-quote (match-string 2 selection)))
              (goto-char (- (point) (1+ (- (current-column) max-column-width))))
              (kill-line)
              (insert current-string-quote)
              (newline)
              (insert indentation-and-quote)
              (yank)))

           ;; Line containing only the string (possibly with preceding whitespaces)
           (t
            (goto-char (- (point) (if current-string-quote 2 1)))
            (kill-line)
            (when current-string-quote
              (insert current-string-quote))
            (newline)
            ;; Indentation
            (when (string-match "^\\([[:blank:]]*\\(?:[A-Za-z]*['\"]\\)?\\)[^'\"]+\\'" selection)
              (insert (match-string 1 selection)))
            (yank))))))))
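
To make the guard that decides whether any formatting happens easier to follow, here is a minimal sketch that pulls the "unclosed quote" check out into its own predicate. The helper name `local-unclosed-quote-p` is hypothetical and not part of the function above, which inlines the check. The regexp is copied from it unchanged.

```elisp
;; Hypothetical helper, for illustration only; the function above
;; inlines this check rather than calling a helper.
(defun local-unclosed-quote-p (line)
  "Return non-nil if LINE ends inside an unclosed single/double quote.
Uses the same regexp as the function above."
  (string-match-p "^[^'\"]*\\(['\"]\\)[^'\"]+\\'" line))

(local-unclosed-quote-p "print(\"hello world")    ; => 0 (match at start)
(local-unclosed-quote-p "print(\"hello world\")") ; => nil (string is closed)
(local-unclosed-quote-p "x = 42")                 ; => nil (no quote at all)
(local-unclosed-quote-p "x = \"a\" + \"b")        ; => nil (earlier closed string)
```

The last case shows a consequence of the pattern: the part before the opening quote may not contain any quote characters, so a line that already holds a closed string before the open one is not picked up by this check.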

The function needs to be added to post-command-hook so that it runs after every keystroke and can act as soon as point reaches the limit column:

(add-hook 'post-command-hook #'local-auto-format-max-column-width-string-python-mode)

Also, the above add-hook can be run only from python-mode-hook, so that the function is not installed (and invoked unnecessarily) before a Python buffer is opened, e.g.:

(add-hook 'python-mode-hook (lambda () (add-hook 'post-command-hook #'local-auto-format-max-column-width-string-python-mode)))

(I've used a lambda in the example above, but a named function would be more appropriate as the hook function: if we need to remove the hook at some point, remove-hook needs the exact function that was passed to add-hook.)
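
A minimal sketch of the named-function variant could look like the following. The helper name `local-enable-string-auto-format` is hypothetical. Note the `t` (LOCAL) argument to add-hook, which is a change from the lambda version: without it, the function ends up on the global post-command-hook as soon as the first Python buffer is opened.

```elisp
;; Hypothetical helper name; not part of the original setup.
(defun local-enable-string-auto-format ()
  "Enable string auto-formatting in the current buffer only."
  ;; The fourth argument (LOCAL) makes the hook buffer-local, so
  ;; non-Python buffers never run the formatter.
  (add-hook 'post-command-hook
            #'local-auto-format-max-column-width-string-python-mode
            nil t))

(add-hook 'python-mode-hook #'local-enable-string-auto-format)

;; To undo later:
;; (remove-hook 'python-mode-hook #'local-enable-string-auto-format)
;; and, in any buffer where the formatter is already active:
;; (remove-hook 'post-command-hook
;;              #'local-auto-format-max-column-width-string-python-mode t)
```

Because both hook functions are named symbols, each remove-hook call can refer to exactly what was added.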


Note: Since post-command-hook runs after every command, we need to make sure the hook function does not slow down the Emacs command loop. In my tests I haven't noticed any slowness or other related issues in Emacs, so it works for me.
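
One rough way to check the cost on your own setup is to time the hook function with `benchmark-run` from the built-in benchmark library. This is only a sketch: the repetition count of 10000 is an arbitrary choice, and it should be evaluated (e.g. with M-:) in a python-mode buffer with point away from column 89, since at that column the function edits the buffer.

```elisp
(require 'benchmark)

;; Returns a list: (ELAPSED-SECONDS GC-RUNS GC-SECONDS).
;; Dividing ELAPSED-SECONDS by the repetition count gives a rough
;; per-command overhead for the common case where nothing is reformatted.
(benchmark-run 10000
  (local-auto-format-max-column-width-string-python-mode))
```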


References:

  1. blacken-mode
  2. black
  3. post-command-hook
  4. add-hook
  5. remove-hook
  6. Python string prefixes
