Slide to be parsed
Here's the breakdown of the text runs
Text Before: Token fun
Text Before: Sometimes a [@Token] string will get broken up.
Text Before: When spell check flags a misspelled word the
Text Before: [@
Text Before: MisspelledToken
Text Before: ] gets split between multiple text blocks
Text Before: Sometimes a [@Token] string will get broken up.
Text Before: When spell check flags a misspelled word the
Text Before: [@
Text Before: MisspelledToken
Text Before: ] gets split between multiple text blocks
As you can see, one token is spread across multiple text runs and therefore doesn't get replaced. This can be a major hang-up. A quick fix for this one example relies on what I know about the split: when a text run ends with the start-token, the next two text runs hold the token name and then the end-token. So I can append those three runs together and then replace the token as usual.
// Requires: using System.Collections.Generic; using System.Linq;
// using DocumentFormat.OpenXml.Packaging; using Drawing = DocumentFormat.OpenXml.Drawing;
void SubstituteText(SlidePart slidePart, string tokenStart, string tokenEnd, string tokenValue, string subValue)
{
    string token = tokenStart + tokenValue + tokenEnd;

    // Collect all the paragraphs containing the token
    List<Drawing.Paragraph> paragraphList = slidePart.Slide
        .Descendants<Drawing.Paragraph>()
        .Where(p => p.InnerText.Contains(token)).ToList();

    foreach (Drawing.Paragraph p in paragraphList)
    {
        // Collect all the text elements of the paragraph
        List<Drawing.Text> textList = p
            .Descendants<Drawing.Text>()
            .ToList();

        // Iterate and find the tokenStart text block, or replace the text if the whole token is found
        for (int i = 0; i < textList.Count; i++)
        {
            Drawing.Text t = textList[i];
            if (t.Text.EndsWith(tokenStart) && i + 2 < textList.Count)
            {
                // Append the next two text segments and remove them
                for (int n = 0; n < 2; n++)
                {
                    Drawing.Text appendText = textList[i + 1];
                    t.Text += appendText.Text;
                    // Must remove at the Drawing.Run level
                    appendText.Parent.Remove();
                    // Always remove at i + 1: the list shifts down after each removal
                    textList.RemoveAt(i + 1);
                }
            }
            // Substitute text
            t.Text = t.Text.Replace(token, subValue);
        }
    }
}
After running this I get the following output as desired.
.pptx after substitution
Breakdown of the text runs after substitution
Text After: Token fun
Text After: Sometimes a TOKEN string will get broken up.
Text After: When spell check flags a misspelled word the
Text After: BROKEN TOKEN gets split between multiple text blocks
Text After: Sometimes a TOKEN string will get broken up.
Text After: When spell check flags a misspelled word the
Text After: BROKEN TOKEN gets split between multiple text blocks
Now, this doesn't cure all ills. I have also seen cases where, after going back and editing existing text, the start-token and token name end up together in one run but the end-token is separated into another text run. We also can't just take all the runs of a paragraph, smash them together into a single run, and then substitute. That would throw away any text styling applied within the paragraph. I have tried parsing a paragraph by comparing the RunProperties of consecutive runs to see whether it is valid to append them together. I am still working on this and will present the final solution at a later time.
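To make the run-comparison idea concrete, here is a minimal sketch of merging consecutive runs only when their formatting matches, and then substituting. It is not the finished solution and it does not touch the Open XML SDK: StyledRun, RunMerger, MergeAdjacent and Substitute are hypothetical names, and the Style string is an assumed stand-in for whatever comparison key you build from a run's RunProperties. In a real slide that key would probably have to ignore proofing-only attributes (spell check appears to mark flagged runs with attributes such as err and dirty), otherwise the misspelled run would never compare equal to its neighbours.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a Drawing.Run: the run's text plus a string key
// summarizing its formatting (assumed to be derived from its RunProperties).
public sealed class StyledRun
{
    public string Text { get; set; }
    public string Style { get; }

    public StyledRun(string text, string style)
    {
        Text = text;
        Style = style;
    }
}

public static class RunMerger
{
    // Merge neighbouring runs that share the same style key, so a token that
    // was split across same-styled runs becomes contiguous again.
    public static List<StyledRun> MergeAdjacent(IEnumerable<StyledRun> runs)
    {
        var merged = new List<StyledRun>();
        foreach (StyledRun run in runs)
        {
            StyledRun last = merged.Count > 0 ? merged[merged.Count - 1] : null;
            if (last != null && last.Style == run.Style)
            {
                last.Text += run.Text;
            }
            else
            {
                // Copy the run so the caller's input is left untouched
                merged.Add(new StyledRun(run.Text, run.Style));
            }
        }
        return merged;
    }

    // Merge first, then replace the token inside each remaining run.
    public static List<StyledRun> Substitute(IEnumerable<StyledRun> runs, string token, string subValue)
    {
        List<StyledRun> merged = MergeAdjacent(runs);
        foreach (StyledRun run in merged)
        {
            run.Text = run.Text.Replace(token, subValue);
        }
        return merged;
    }
}
```

With this model it no longer matters whether the split falls after the start-token or before the end-token, as long as the pieces share a style. A token that straddles a real style change (say, half of it bold) still stays unreplaced, which is the trade-off for keeping the paragraph's styling intact.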
Hopefully this post has revealed some of the nuances of text substitution and shown that it isn't always straightforward.