4

# can somebody tell me why a string has (n*(n+1)/2) substrings?

• 2
Sum of arithmetic series?
• 2
• 4
Because that is the formula.

CAT = 3 letters => so it has 6 possible different substrings.

You could just say, why not just do n^2 ? well because the start and end of the string shouldn't be counted twice.

So in the example 'CAT', you end up with only 2 characters => n^2 = 2^2 = 4 + start + end = 6

(This is of course very simply explained :D)

Edit : see link above :P Or go watch some numberphile videos on YouTube. They have one on that subject I believe.
• 0
@magicMirror yeah but that series has been used for finding sum between [1..N] inclusive but how this is useful for finding count of substring in a string.
• 6
@priyanshu-zeon

Think of it as an induction problem.

Start with a string of length 1, it has 1 substring (itself).

If you add a character, you get two new substrings, the character itself and the new string, so that makes 2 more

If you keep adding characters, everytime you extend to length n, you'll see you get n substrings added.

Therefore, the count of substrings in a string of length n is the sum of all numbers from 1 to n inclusive, which of course, is solved by the formula you have already been given.
• 1
Math
• 1
@phat-lasagna duhh, ofcource it's ducking math
• 0
@dotenvironment But that is not how the formula works with substrings.

Your string '123' will have 6 substrings :

'1'

'2'

'3'

'12'

'23'

'123'
• 0
@Grumm damn i was so fantastically wrong. precisely why i hate ds and algo. thanks for being polite, if this was reddit or twitter, this would have blown up with trolling :D
• 1
@Grumm
Thats Newton's Binomial theorem.
"cat" does not have "ct", or "tc" as substrings.
it does have these:
c, a, t, ca, at, cat = 6.
the given calculation is 3*(4/2)=6.
Induction is a way to prove correctness - but does not explain the idea behind the formula.
• 1
@priyanshu-zeon another attempt at eli5

a substring is a continuous part of a string . any string when cut from somewhere will make substrings. the size of each sustring after the cut somehow relates with the number of total unique substrings which somehow relates formula, therefore we use that formula.

eg for string,

- making 0 cuts:
abcd

- making 1 cut :
a,bcd
ab,cd
abc,d
a,bcd

- making 2 cuts
a,bc,d
ab,c,d
a,b,cd

- making 3 cuts :
a,b,c,d

therefore the various substrings can be written as

a,b,c,d, ab,bc,cd ,abc,bcd, abcd
= 4 substrings of size 1 + 3 subs of size 2 + 2subs of size 3 + 1 sub of size 4
= 4+3+2+1
= the answer for sum formula
• 0
@dotenvironment

That's not a mathematical formulation, as it becomes intractable for large n. You only gave a concrete example for n=3.

Proving a theorem by induction means that you prove it for n, and then for n+1 in terms of n.

If you can do that, then it follows that at least, it is valid for any natural number.

Going by your logic, Fermat's last theorem would hold for any n, when in truth, it's unknown for any n >= 3, because there is no way to express it for n=3 in terms of n=2.
• 0
Well the string is length 4 but can't edit. Point still stands though.

Please note that in not saying you are wrong, but as proof, your explanation is missing a hidden step, in which you manually remove duplicates.

A formal mathematical proof must be able to justify stumbling upon and removing those duplicates without requiring visual inspection.
• 0
@CoreFusionX yes i wasn't going around the mathematical Indic route. its just something i use to identify substrings ( and a slotted window analogy)
• 1
@magicMirror There are rules in place :

selecting 2 blank spaces atleast one letter apart.

a| b c | d = substring bc

| a b c |d = substring abc.

Now how many ways can you chose these 2 blankspace. For n letter word there are n+1.

Then first select one = n+1 ways

Select another (not the same)= n

So total n(n+1). But you have calculated everything twice. So n*(n+1)/2.

This has for me nothing to do with induction. It is just to prove the formula and it works for any string no matter how large it is.

You mention 'ct" or 'tc' but those aren't even possibilities.