Torge Matthies : gitlab: Make mbox splitting more robust.

Alexandre Julliard julliard at winehq.org
Wed Jul 6 11:57:18 CDT 2022


Module: tools
Branch: master
Commit: 1dd718b79d3511e7e9d614fc9c6d96be4ffcc655
URL:    https://source.winehq.org/git/tools.git/?a=commit;h=1dd718b79d3511e7e9d614fc9c6d96be4ffcc655

Author: Torge Matthies <openglfreak at googlemail.com>
Date:   Fri Jun 24 07:16:35 2022 +0200

gitlab: Make mbox splitting more robust.

This complements the fix in af62ee218ed25a1502490a1017cf53d23cf65307
by not splitting on every line that starts with From.
The mbox split algorithm now uses the same From-line detection as git
mailsplit.

---

 gitlab/gitlab-to-mail/gitlabtomail.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gitlab/gitlab-to-mail/gitlabtomail.py b/gitlab/gitlab-to-mail/gitlabtomail.py
index 8333588e..03824504 100755
--- a/gitlab/gitlab-to-mail/gitlabtomail.py
+++ b/gitlab/gitlab-to-mail/gitlabtomail.py
@@ -278,10 +278,12 @@ def decode_rfc2822(header):
 
 def split_mbox_into_messages(mbox):
     result = []
-    for mail in re.split(r'^From ', mbox, flags=re.M):
+    # Ported from Git's mailsplit code to regex.
+    regex = r'^(?=.{20,})(?=From .*\d{2}:\d{2}:\d{2}\s*(?:9[1-9]|\d{3,})(?!\d)[^:\n]*$)'
+    for mail in re.split(regex, mbox, flags=re.M|re.A):
         if not mail:
             continue
-        result.append(email.message_from_string('From ' + mail))
+        result.append(email.message_from_string(mail))
     return result
 
 
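To illustrate what the new From-line detection accepts and rejects, here
is a minimal sketch that runs the regex from the patch above against a
small hand-written mbox. The sample messages, hashes, and addresses are
made up for illustration and are not taken from the repository; the
split function is reproduced from the patched version of
split_mbox_into_messages, and the naive split mirrors the line the patch
removes.

```python
import email
import re

# Regex copied verbatim from the patch above.
FROM_LINE = (r'^(?=.{20,})(?=From .*\d{2}:\d{2}:\d{2}\s*'
             r'(?:9[1-9]|\d{3,})(?!\d)[^:\n]*$)')


def split_mbox_into_messages(mbox):
    # Same logic as the patched function: split only in front of lines
    # that look like real mbox separators (a time followed by a year).
    result = []
    for mail in re.split(FROM_LINE, mbox, flags=re.M | re.A):
        if not mail:
            continue
        result.append(email.message_from_string(mail))
    return result


# Hypothetical sample data: two messages, the first of which has body
# lines that start with "From " but are not separators.
SAMPLE_MBOX = (
    "From 1111111111111111111111111111111111111111 Mon Sep 17 00:00:00 2001\n"
    "From: Alice Example <alice@example.org>\n"
    "Subject: [PATCH 1/2] first\n"
    "\n"
    "From now on this line is plain body text.\n"
    "From the log at 12:34:56 nothing follows that looks like a year.\n"
    "From 2222222222222222222222222222222222222222 Mon Sep 17 00:00:00 2001\n"
    "From: Bob Example <bob@example.org>\n"
    "Subject: [PATCH 2/2] second\n"
    "\n"
    "Second body.\n"
)

messages = split_mbox_into_messages(SAMPLE_MBOX)

# The pre-patch behaviour for comparison: split on every "From " line.
naive = [m for m in re.split(r'^From ', SAMPLE_MBOX, flags=re.M) if m]

print(len(messages), "messages with the new regex")
print(len(naive), "pieces with the old split")
for msg in messages:
    print(msg.get_unixfrom(), "|", msg["Subject"])
```

Because the new pattern consists only of lookaheads, the separator line
stays at the start of each piece, which is why the patch no longer has
to prepend 'From ' before calling email.message_from_string. Splitting
on such zero-width matches relies on the re.split behaviour of Python
3.7 and later.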



