@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Transcode the HTML part of incoming email into UTF-8 as well

Summary:
D1093 did this for just the text/plain part of incoming
email. Most text/html parts either use entity encoding //or// are
already UTF-8, so they need no transcoding. However, this is not always
the case, and a text/html part in another charset causes the message to
be dropped, by way of:

```
EXCEPTION: (Exception) Failed to JSON encode value (#5: Malformed UTF-8 characters, possibly incorrectly encoded): Dictionary value at key "html" is not valid UTF8, and cannot be JSON encoded: [snip HTML part of message content]
```
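
As a rough illustration of the failure mode, the sketch below feeds PHP's built-in `json_encode()` a dictionary whose `html` value contains a lone 0x92 byte. The `$html` sample string is made up for this example, and the real exception is raised by Phabricator's own JSON wrapper rather than by this code; only the underlying error (#5, `JSON_ERROR_UTF8`) is the same.

```php
<?php
// 0x92 is a right single quotation mark in Windows-1252, but a lone 0x92
// byte is not a valid UTF-8 sequence. (Hypothetical sample body.)
$html = "<p>It\x92s here</p>";

// Without JSON_PARTIAL_OUTPUT_ON_ERROR or JSON_THROW_ON_ERROR,
// json_encode() returns false when a string is not valid UTF-8.
$encoded = json_encode(array('html' => $html));
$error_code = json_last_error();
$error_message = json_last_error_msg();

var_dump($encoded);
echo $error_code, ': ', $error_message, "\n";
```

Any code path that serializes the mail bodies to JSON hits this error as soon as one body is left in its original non-UTF-8 charset.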

Generalize the charset transcoding so that it applies to both the text/plain
and text/html parts, not just text/plain.

Test Plan:
Fed in a Windows-1252-encoded text/html part with 0x92
bytes in it; verified that $content only contained valid UTF-8 after
this change.
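
The test plan can be approximated outside Phabricator with a standalone sketch like the one below. It is not the code from this change: `sketch_is_utf8()` and `sketch_transcode_part()` are hypothetical stand-ins for libphutil's `phutil_is_utf8()` and `phutil_utf8_convert()`, built on PHP's PCRE and iconv extensions, and the `$parts` array is an invented substitute for what `MimeMailParser` would return.

```php
<?php
// Hypothetical stand-in for phutil_is_utf8(): with the /u modifier, PCRE
// refuses subjects that are not valid UTF-8, so preg_match() returns false.
function sketch_is_utf8($bytes) {
  return preg_match('//u', $bytes) === 1;
}

// Hypothetical stand-in for the per-part logic: transcode only when the body
// is not already UTF-8 and the Content-Type header names a charset.
function sketch_transcode_part($body, $content_type) {
  if (
    !sketch_is_utf8($body) &&
    (preg_match('/charset="(.*?)"/', $content_type, $matches) ||
     preg_match('/charset=(\S+)/', $content_type, $matches))
  ) {
    // Assumption: iconv knows the declared charset; it returns false if not.
    $body = iconv($matches[1], 'UTF-8', $body);
  }
  return $body;
}

// Invented message: a UTF-8 text part and a Windows-1252 HTML part with 0x92.
$parts = array(
  'text' => array("It\xE2\x80\x99s here", 'text/plain; charset=UTF-8'),
  'html' => array("<p>It\x92s here</p>", 'text/html; charset="windows-1252"'),
);

$content = array();
foreach ($parts as $part => $spec) {
  list($body, $content_type) = $spec;
  $content[$part] = sketch_transcode_part($body, $content_type);
}

foreach ($content as $part => $body) {
  echo $part, ': ', (sketch_is_utf8($body) ? 'valid' : 'invalid'), " UTF-8\n";
}
```

Looping over the part names keeps the charset handling in one place, which is the same shape as the change to `scripts/mail/mail_handler.php`.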

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D18776

+14 -14
scripts/mail/mail_handler.php
@@ -35,16 +35,19 @@
 $parser = new MimeMailParser();
 $parser->setText(file_get_contents('php://stdin'));
 
-$text_body = $parser->getMessageBody('text');
-
-$text_body_headers = $parser->getMessageBodyHeaders('text');
-$content_type = idx($text_body_headers, 'content-type');
-if (
-  !phutil_is_utf8($text_body) &&
-  (preg_match('/charset="(.*?)"/', $content_type, $matches) ||
-   preg_match('/charset=(\S+)/', $content_type, $matches))
-) {
-  $text_body = phutil_utf8_convert($text_body, 'UTF-8', $matches[1]);
+$content = array();
+foreach (array('text', 'html') as $part) {
+  $part_body = $parser->getMessageBody($part);
+  $part_headers = $parser->getMessageBodyHeaders($part);
+  $content_type = idx($part_headers, 'content-type');
+  if (
+    !phutil_is_utf8($part_body) &&
+    (preg_match('/charset="(.*?)"/', $content_type, $matches) ||
+     preg_match('/charset=(\S+)/', $content_type, $matches))
+  ) {
+    $part_body = phutil_utf8_convert($part_body, 'UTF-8', $matches[1]);
+  }
+  $content[$part] = $part_body;
 }
 
 $headers = $parser->getHeaders();
@@ -57,10 +60,7 @@
 
 $received = new PhabricatorMetaMTAReceivedMail();
 $received->setHeaders($headers);
-$received->setBodies(array(
-  'text' => $text_body,
-  'html' => $parser->getMessageBody('html'),
-));
+$received->setBodies($content);
 
 $attachments = array();
 foreach ($parser->getAttachments() as $attachment) {