Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Howdy folks,
I have a situation where a text string is being imported that could contain hyperlinks (it might not contain any, or it could contain several). What I want to do is extract the text and format the link to be contained within brackets after the hyperlinked text. For example:
<p>This is some text</p>
<p><a href="http://www.google.com" rel="nofollow">Google link</a></p>
<p> </p>
<p><a href="http://www.bing.com/" rel="nofollow">Bing link</a></p>
This is some text
Google link (http://www.google.com)
Bing link (www.bing.com)
I can strip all the tags without issue, but I need to keep the destination URL as well and would like to format it nicely.
Is there a straightforward way to do this?
Solved! Go to Solution.
Hi @afhealey ,
You can splilt column many times by specific string and filter the specfic string to achieve this in power query. The whole query would be like this:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wiik1MDBOLgBTqSEZmcUKQFScn5uqUJJaUQKR1YdKx+ShqIZwEhUyilLTbGOUMkpKCqz09cvLy/XS8/PTc1L1kvNzY5QUilJzgLJ5+Wn5OTn55TEQG1PdwUoUcjLzsqGWJCKbisNKBTIdlJSZlw5yjj5O9zgBVRB0jVJsLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Data = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Data", type text}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Data", Splitter.SplitTextByDelimiter("<p>", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Data"),
#"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter", {{"Data", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Data"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Data", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type1", each Text.Contains([Data], "href")),
#"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows", {{"Data", Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Data"),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Data", type text}}),
#"Filtered Rows1" = Table.SelectRows(#"Changed Type2", each not Text.Contains([Data], "href"))
in
#"Filtered Rows1"
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @afhealey ,
You can splilt column many times by specific string and filter the specfic string to achieve this in power query. The whole query would be like this:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wiik1MDBOLgBTqSEZmcUKQFScn5uqUJJaUQKR1YdKx+ShqIZwEhUyilLTbGOUMkpKCqz09cvLy/XS8/PTc1L1kvNzY5QUilJzgLJ5+Wn5OTn55TEQG1PdwUoUcjLzsqGWJCKbisNKBTIdlJSZlw5yjj5O9zgBVRB0jVJsLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Data = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Data", type text}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Data", Splitter.SplitTextByDelimiter("<p>", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Data"),
#"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter", {{"Data", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Data"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Data", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type1", each Text.Contains([Data], "href")),
#"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows", {{"Data", Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Data"),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Data", type text}}),
#"Filtered Rows1" = Table.SelectRows(#"Changed Type2", each not Text.Contains([Data], "href"))
in
#"Filtered Rows1"
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
I really doubt Power Query would cope with it with ease considering following cases in practice, CAPITAL letters, arbitary blank spaces, etc,
<A HREF = "http://www.google.com" REL="nofollow">Google link</A>
<a href="http://www.google.com" rel="nofollow">Google link</a>
<a href= 'http://www.yahoo.com' rel="nofollow">Yahoo link</a>
Resort to regular expression with embedded R or Python script using such a pattern,
(?<=href)(?:[^"\']*)(?P<DELIM>["\'])(?P<LINK>.*?)(?P=DELIM)
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
102 | |
53 | |
21 | |
12 | |
12 |